(57) Abstract

The present invention is related to an isolated and
purified polypeptide whose amino acid sequence presents
more than 70% homology with the sequence SEQ ID NO 1. The
present invention is also related to the nucleotide sequence
encoding said amino acid sequence, the inhibitor directed
against said sequences and their use in the diagnosis,
treatment and/or prevention of lung injuries or diseases
and oxidative stress-related disorders.



NUCLEOTIDE SEQUENCE ENCODING SAID POLYPEPTIDE AND 
THE TREATMENT OF LUNG INJURIES AND DISEASES, AND OF 



[Sequence alignment figure: the residue-level rows are not legible in the source; only the captions are retained.]

CLUSTAL V alignment of human and rat B18 amino acid sequences (Identity: 90%, Homology: 97.5%)

CLUSTAL V alignment of human and mouse B18 amino acid sequences (Identity: 91%, Homology: 96%)

CLUSTAL V alignment of human and rat cDNA sequences
The peroxisomal respiratory pathway is based
upon the formation of hydrogen peroxide by a collection of
oxidases and the decomposition of the H2O2 by catalase.
These reactions are responsible for 20% of oxygen
consumption in liver, and several oxidases have been
identified in peroxisomes. Ethanol elimination via catalase
in peroxisomes may be significant in addition to the
oxidation via cytosolic alcohol dehydrogenase.

The peroxisomal beta-oxidation system
catalyses the beta-oxidative chain shortening of a specific
set of compounds which cannot be handled by mitochondria:
very long chain fatty acids, di- and trihydroxycholestanoic
acids, pristanic acid, long chain dicarboxylic acids,
several prostaglandins, several leukotrienes, 12- and
15-hydroxyeicosatetraenoic acid, and several mono- and
polyunsaturated fatty acids, which are of direct diagnostic
relevance for some peroxisomal disorders.

Peroxisomes also play a major role in the
synthesis of cholesterol and other isoprenoids. Fibroblasts
from patients affected by disorders of peroxisome
biogenesis show a low capacity to synthesise cholesterol.

Two enzyme activities responsible for the
introduction of the characteristic ether linkage in
ether-linked phospholipids (dihydroxyacetonephosphate
acyltransferase (DHAPAT) and alkyldihydroxyacetonephosphate
synthase (alkyl-DHAP synthase)) are localised in
peroxisomes. These enzymes are not yet cloned. As
demonstrated by the identification of patients with a
deficiency of either DHAPAT or alkyl-DHAP synthase and with
severe clinical abnormalities, ether-phospholipids are of
major importance in humans.
Peroxisomes are able to detoxify glyoxylate
via alanine/glyoxylate aminotransferase. The deficiency of
this cloned enzyme causes hyperoxaluria type I.
L-pipecolate is a minor metabolite of L-lysine and is
catabolised by the L-pipecolate oxidase localised in
peroxisomes. The enzyme is deficient in cerebro-hepato-renal
(Zellweger) syndrome.

In humans, the importance of peroxisomes was
emphasised by a number of inherited diseases involving
either a defect in the biogenesis of peroxisomes or a
deficiency of one (or more) peroxisomal enzymes. So far, 12
different peroxisomal disorders have been described and
most of them are lethal.

A wide variety of chemicals have been shown
to produce peroxisome proliferation and induction of
peroxisomal and microsomal fatty acid-oxidising enzyme
activities in rats and mice. Several peroxisome
proliferators have been shown to increase the incidence of
liver tumours in these species. Proposed mechanisms of
liver tumour formation by peroxisome proliferators include
induction of sustained oxidative stress.

Therefore, newly identified molecules
associated with peroxisomes could be used for the
development of diagnostic tools and possibly for the
improvement of several therapeutic applications for
various diseases associated with peroxisomal disorders. In
addition, it is useful to identify the molecules which are
present in specific organs like the lung and which may be
used as specific markers of inflammatory diseases as well
as lung injuries or diseases.
Summary of the invention

The Inventors have isolated and purified a
new sequence of a low molecular weight human
broncho-alveolar polypeptide. Said mammal, preferably human,
protein or polypeptide (hereafter identified as B18hum
protein) has been sequenced and its corresponding genomic
DNA (SEQ ID NO 8) and cDNA (SEQ ID NO 1) have been
identified. Similarly, the corresponding nucleotide and
amino acid sequences from a rat (SEQ ID NO 3 and 4) and from
a mouse (SEQ ID NO 5 and 6) have been obtained.

Said sequences present several homologies
with other peroxisomal proteins of yeast and comprise a
carboxy-terminal tripeptide SQL which is necessary for the
specific targeting and translocation of several proteins
into the peroxisome.

Therefore, the present invention is related
to a new isolated and purified polypeptide sequence having
an amino acid sequence which presents more than 70%
homology, advantageously more than 85% homology, more
preferably more than 95% homology, with the amino acid
sequence SEQ ID NO 2. Said amino acid sequence is
advantageously obtained from a mammal, preferably from a
rat, a mouse or a human.
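
The homology thresholds above can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned (gaps written as "-") and uses percent identity as a simple, stricter proxy for the homology figure, which in the text may also count conservative substitutions. The function names and the toy sequences are hypothetical and are not taken from SEQ ID NO 2.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of gap-free alignment columns that hold identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def homology_tier(percent: float) -> str:
    """Map a percentage onto the three thresholds named in the text."""
    if percent > 95:
        return "more than 95%"
    if percent > 85:
        return "more than 85%"
    if percent > 70:
        return "more than 70%"
    return "70% or less"


if __name__ == "__main__":
    # Toy, hypothetical aligned peptides (one substitution in ten columns).
    value = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
    print(value, homology_tier(value))
```

A real comparison would first align the sequences with a dedicated tool; this sketch only covers the counting step.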

The present invention is also related to the
isolated and purified polypeptide sequence corresponding to
the amino acid sequence SEQ ID NO 2 or a portion thereof,
preferably an immunoreactive portion (putative immunogenic
domain or T or B cell epitopes).

Said portions are advantageously comprised
between:

- Glutamic acid position 13 - Glutamic acid position 27
- Alanine position 26 - Leucine position 36
- Alanine position 42 - Glutamic acid position 57
- Glutamic acid position 57 - Valine position 69
- Valine position 80 - Leucine position 97
- Arginine position 95 - Leucine position 112
- Serine position 118 - Serine position 129
- Valine position 137 - Threonine position 150

Preferably, said portion has more than 10,
20, 30, 50 or 70 amino acids. Specific portions of the
amino acid sequence SEQ ID NO 2 are also portions of more
than 70 amino acids which present at least 80% of the
proteinic activity (see example 5) of the complete SEQ ID
NO 2 sequence. Therefore, the amino acid sequence according
to the invention can be partially deleted while maintaining
its activity, preferably its anti-oxidative activity, which
will be described hereafter.

According to the invention, the amino acid
sequence SEQ ID NO 2 presents a pI of 7.16 and a molecular
weight of 17047 Dalton as hereafter defined by
bidimensional electrophoresis.

The present invention is also related to the
nucleotide sequence encoding the amino acid sequence
according to the invention and its regulatory sequences
upstream of said coding sequence. A nucleotide sequence
encoding the polypeptide according to the invention is a
genomic DNA (see SEQ ID NO 10), a cDNA (see SEQ ID NO 1) or
an mRNA, possibly comprising said upstream regulatory
sequence. Advantageously, said nucleotide sequence presents
more than 70%, advantageously more than 85%, more
preferably more than 95% homology with SEQ ID NO 1 or its
complementary strand.
According to a preferred embodiment of the
present invention, said nucleotide sequence corresponds to
the nucleotide sequence SEQ ID NO 1, its complementary
strand or a portion thereof.

"A portion of the nucleotide sequence SEQ ID
NO 1" means any nucleotide sequence of more than 15 base
pairs (such as a primer, a probe or an antisense nucleotide
sequence) which allows the specific identification,
reconstitution or blocking of the complete nucleotide
sequence SEQ ID NO 1 (including its regulatory sequences
upstream of the coding sequence).

Said portions allow the specific
identification, reconstitution or blocking by specific
hybridisation with the nucleotide sequence SEQ ID NO 1,
preferably under standard stringent conditions, with
sequences like probes or primers possibly labelled with a
compound (radioactive compound, enzyme, fluorescent marker,
etc.), and can be used in a specific diagnostic or dosage
method like probe hybridisation (see Sambrook et al., §§
9.47-9.51 in Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, New
York (1989)) or genetic amplification (like PCR (US patent
4,683,195), LCR (Wu et al., Genomics 4, pp. 560-569) or CPR
(US patent 5,011,769)).

WO 99/09054 PCT/BE98/00124

Exemplary stringent hybridisation conditions
are as follows: hybridisation at 42 °C in 50% formamide, 5x
SSC, 20 mM sodium phosphate, pH 6.8; washing in 0.2x SSC at
55 °C. It is understood by those skilled in the art that
variations of these conditions occur based on the length and
GC nucleotide content of the sequence to be hybridised.
Formulas standard in the art are appropriate for
determining exact hybridisation conditions (see Sambrook et
al.).
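
As an illustration of the kind of standard formula referred to above, the sketch below implements one widely cited empirical approximation for the melting temperature of long DNA:DNA hybrids. This particular formula and its coefficients are an assumption brought in for illustration; they are not stated in this document, and published variants differ slightly. The function name and parameters are hypothetical.

```python
import math


def estimate_tm_celsius(na_molar: float, gc_percent: float,
                        formamide_percent: float, length_bp: int) -> float:
    """Empirical Tm estimate for long DNA:DNA hybrids.

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    (one commonly quoted form; coefficients vary between references).
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


if __name__ == "__main__":
    # Hypothetical probe: 500 bp, 50% GC, 1 M monovalent cation, 50% formamide.
    print(round(estimate_tm_celsius(1.0, 50.0, 50.0, 500), 1))
```

Hybridisation is typically run some degrees below the estimated Tm. The estimate shows why a longer or GC-richer probe tolerates more stringent conditions.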

Preferred examples of said nucleotide
portions are as follows:

Sequence                                          Position
5'-gccatcccagcagtggaggtgtttg-3' (SEQ ID NO 11)    217-241
5'-ttgaacagctctgccaggttcacc-3' (SEQ ID NO 12)     261-284
5'-tggaggtgtttgaaggggagccag-3' (SEQ ID NO 13)     230-253
5'-caggttcaccttgttccctggctc-3' (SEQ ID NO 14)     247-270
5'-gggtatgggactagctggcg-3' (SEQ ID NO 15)         33-52
5'-ctggccaacattccaattgcag-3' (SEQ ID NO 16)       747-768

and the sequences of respectively 601 (SEQ ID NO 8), 604
(SEQ ID NO 9) and 469 (SEQ ID NO 7) base pairs
corresponding to specific mRNA alternative splicing of the
B18 human nucleotide sequence as described in Figure 4 (the
known genomic sequence incorporating several introns and
exons is represented in the sequence SEQ ID NO 10).

Said sequences may be used for a genetic
amplification or a probe hybridisation as described above.
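
The listed primer positions can be sanity-checked against the primer lengths. The sketch below assumes the positions are 1-based, inclusive coordinates on SEQ ID NO 1, which the text does not state explicitly. The helper names are hypothetical.

```python
# Primer sequences and positions as listed in the text above, keyed by SEQ ID NO.
PRIMERS = {
    11: ("gccatcccagcagtggaggtgtttg", 217, 241),
    12: ("ttgaacagctctgccaggttcacc", 261, 284),
    13: ("tggaggtgtttgaaggggagccag", 230, 253),
    14: ("caggttcaccttgttccctggctc", 247, 270),
    15: ("gggtatgggactagctggcg", 33, 52),
    16: ("ctggccaacattccaattgcag", 747, 768),
}


def span_length(start: int, end: int) -> int:
    """Number of bases covered by an inclusive 1-based interval."""
    return end - start + 1


def reverse_complement(seq: str) -> str:
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]


def check_primer_lengths(primers=PRIMERS) -> dict:
    """For each primer, report whether its length equals its listed span."""
    return {seq_id: len(seq) == span_length(start, end)
            for seq_id, (seq, start, end) in primers.items()}


if __name__ == "__main__":
    for seq_id, ok in check_primer_lengths().items():
        print(f"SEQ ID NO {seq_id}: length matches positions = {ok}")
```

Under that assumption every primer length agrees with its listed interval. The reverse complement of SEQ ID NO 14 also begins with the last seven bases of SEQ ID NO 13, which is consistent with their overlapping positions (247-253) if SEQ ID NO 14 anneals to the opposite strand.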

The present invention is also related to a
vector comprising the necessary elements for the injection,
transfection or transduction of cells and having
incorporated one or more of the nucleotide sequences
according to the invention. The vector according to the
invention is selected from the group consisting of viruses,
plasmids, phagemids, cationic vesicles, liposomes or a
mixture thereof. Said vector may also comprise one or more
adjacent regulatory sequences (such as promoter(s),
secretion and termination signal sequence(s)),
advantageously operably linked to the nucleotide sequence
according to the invention.
The present invention is also related to the
cell transformed by said vector and expressing the
polypeptide according to the invention.

The nucleotide sequence according to the
invention can also be introduced into said cell by the
formation of a CaPO4-nucleic acid precipitate, a
DEAE-dextran-nucleic acid complex or by electroporation.

Another aspect of the present invention is
related to an inhibitor of the polypeptide according to the
invention or the nucleotide sequence according to the
invention (including the upstream sequences like a
promoter-operator regulatory sequence which may be inhibited
by a cis- and/or transactivating repressor). Said inhibitor is
advantageously an antibody or a fragment of said antibody
such as a hypervariable portion of said antibody directed
against the amino acid or nucleotide sequence of the
polypeptide according to the invention. Other examples of
inhibitors according to the invention are antisense
nucleotide sequences which allow the blocking of the
expression of the nucleotide sequence according to the
invention.



Another aspect of the present invention is
related to a diagnostic device (such as a diagnostic kit or
a chromatographic column) comprising an element selected
from the group consisting of the amino acid sequence of
said polypeptide, its nucleotide sequence, and/or the
inhibitor according to the invention or a fragment thereof
as described above. Said diagnostic device may also comprise
the necessary reactants and media for the diagnosis
and/or dosage of the nucleotide and/or amino acid sequence
of the polypeptide according to the invention, which are
based upon a method selected from the group consisting of
in situ hybridisation, hybridisation by labelled
antibodies, especially RIA (Radio Immuno Assay) or ELISA
(Enzyme-Linked Immunosorbent Assay) technologies,
detection upon filter, upon solid support, in solution, in
sandwich, upon gel, dot blot hybridisation, Northern blot
hybridisation, Southern blot hybridisation, isotopic or
non-isotopic labelling (by immunofluorescence or
biotinylated probes), genetic amplification (especially by
PCR or LCR), double immunodiffusion technique,
counter-electrophoresis technique, haemagglutination or a
mixture thereof.

Another aspect of the present invention
concerns a diagnosis method wherein a biological sample,
such as cephalo-rachidian fluid, serum,
blood, plasma, urine, broncho-alveolar lavage, stomach
lavage, etc., is isolated from the patient and is put in
contact with the diagnostic device according to the
invention for the diagnosis or the monitoring of an injury
or a disease, preferably a lung injury or an oxidative
stress-related disorder, affected by the presence of a
pro-oxidant agent or oxidative stress, such as specific
cardio-vascular diseases like arteriosclerosis,
neurodegenerative disorders (Alzheimer's disease,
Parkinson's disease, amyotrophic lateral sclerosis),
apoptosis, inflammatory reactions, allergic reactions such
as asthma, hay fever and eczema, high bone mass syndrome,
osteopetrosis, osteoporosis-pseudoglioma syndrome, and
Bardet-Biedl syndrome 1. Said diagnosis and monitoring upon
one or more biological samples obtained from several tissues
from the patient can be advantageously obtained by one or
more of the methods described above, which could be adapted
according to the specific biological sample by the person
skilled in the art.

Therefore, the product according to the
invention could be used as a marker for the above-identified
injuries, diseases or disorders in a broad
spectrum of tissues as shown in the enclosed Figure 2.

A further aspect of the present invention is
related to a pharmaceutical composition comprising a
pharmaceutically acceptable carrier and an element selected
from the group consisting of the nucleotide sequence, the
amino acid sequence of the polypeptide according to the
invention, the inhibitor directed against said sequences
and/or one or more portions thereof.

A last aspect of the present invention is
related to the use of the pharmaceutical composition
according to the invention for the manufacture of a
medicament for the treatment and/or the prevention of lung
injuries and/or diseases or of oxidative stress-related
disorders.

The present invention is also related to a
prevention and/or treatment method of a patient, especially
a human patient, preferably affected by lung injuries
and/or diseases or by an oxidative stress-related disorder,
wherein a sufficient amount of the pharmaceutical
composition according to the invention is administered to
said patient in order to treat, avoid and/or reduce the
symptoms of said injuries and/or diseases.

Other injuries and/or diseases which can be
prevented and/or treated are injuries and/or diseases
affected by the presence of pro-oxidant agents or oxidative
stress, such as specific cardio-vascular diseases like
arteriosclerosis, neurodegenerative disorders such as
Alzheimer's disease, Parkinson's disease, amyotrophic
lateral sclerosis, apoptosis and inflammatory reactions and
some allergic reactions such as asthma, hay fever and
eczema, high bone mass syndrome, osteopetrosis,
osteoporosis-pseudoglioma syndrome, and Bardet-Biedl
syndrome 1.

The pharmaceutically acceptable carrier
according to the invention is any compatible non-toxic
substance suitable for administering the composition
according to the invention to a human patient.
Pharmaceutically acceptable carriers according to the
invention suitable for oral administration are the ones
well known by the person skilled in the art, such as
tablets, coated or non-coated pills, capsules, spray-gas,
patches, gels, solutions or syrups. Pharmaceutically
acceptable carriers vary according to the mode of
administration (intravenous, intramuscular, subcutaneous,
parenteral, etc.), and may also comprise adjuvants well
known by the person skilled in the art to increase, reduce
and/or regulate the humoral, local and/or cellular response
of the immune system.

The pharmaceutical composition according to
the invention may be prepared by the methods generally
applied by the person skilled in the art in the preparation
of various pharmaceutical compositions, wherein the
percentage of the active compound/pharmaceutically
acceptable carrier can vary within very large ranges, only
limited by the tolerance of the patient to said
pharmaceutical composition, and wherein the limits are
particularly determined by the frequency of administration
and the possible side-effects of the active compound or
its pharmaceutically acceptable carrier.
Another aspect of the invention is related to
the use of the diagnostic device according to the invention
for performing upon the patient or upon a biological fluid
obtained from the patient, a diagnosis, a dosage and/or a
monitoring of the above-mentioned injuries or diseases or
oxidative stress-related disorders affecting the patient.

A further aspect of the present invention is
related to a cell or a non-human animal, preferably a
mammal such as a mouse or a rat, transformed by the vector
according to the invention and overexpressing the
polypeptide according to the invention, or a non-human
animal, preferably a mammal such as a mouse or a rat,
genetically modified by a partial or total deletion of its
genomic sequence encoding the polypeptide according to the
invention (knock-out non-human mammal) and obtained by
methods well known by the person skilled in the art such as
the one described by Kahn et al. (Cell, Vol. 92, pp.
593-596 (March 1998)).

Other examples of genetically modified
non-human animals according to the invention may be a
transgenic non-human animal comprising an inhibitor
according to the invention, preferably an antisense nucleic
acid sequence complementary to the nucleotide sequence
according to the invention so placed as to be transcribed
into antisense mRNA which is complementary to the
nucleotide sequence according to the invention and which
hybridises to said nucleotide sequence, thereby reducing or
blocking its translation.

Further aspects of the present invention will
be described in the enclosed non-limiting examples in
reference to the following Figures.
Brief description of the drawings

Figure 1 represents a dot blot analysis of mRNA
encoding the polypeptide according to the
invention in various types of human tissues.

Figure 2 represents a Northern blot analysis of mRNA
encoding the polypeptide according to the
invention in a rat lung after administration
of lipopolysaccharides (LPS) inducing an
inflammatory reaction of the lung.

Figure 3 represents a Northern blot analysis of mRNA
encoding the polypeptide according to the
invention in a rat lung after intraperitoneal
injection of pneumotoxicants.

Figure 4 is a schematic representation of the human
genomic sequence, the complete cDNA sequence
and the corresponding amino acid sequence.

Figure 5 represents respectively the alignment of the
sequences of the human B18 polypeptide
according to the invention with the
corresponding rat and mouse sequences.



Example 1: Homology between the B18 polypeptide according to
the invention and other known nucleotide or amino acid
sequences

The BLAST 2.0 software (gapped BLAST at the
NCBI Internet site) was used for searching for homologies
between human B18 (162 amino acids) and known polypeptides
in databases (GenBank, SwissProt). Said search did not give
a perfect alignment with known peptides from different
species (Table 1). Homologies of the human B18 cDNA (805
nucleotides) with GenBank, EMBL, DDBJ and PDB deposited
nucleotide sequences (Table 2) and GenBank Expressed
Sequence Tags (ESTs) were noted.
Table 1: Homologies of the B18 protein (162 amino acids) with other proteins

Name                                                                    NCBI ID    Identity (%)   Homology (%)
Membrane protein (Synechocystis sp.)                                    1652859    57/129 (44%)   81/129 (62%)
Peroxisomal-like protein (Aspergillus fumigatus)                        2769700    56/176 (31%)   90/176 (50%)
Haein HI0572 hypothetical protein (Haemophilus influenzae)              1723174    53/146 (36%)   80/146 (54%)
PMP20 (Schizosaccharomyces pombe)                                       AJ002536   54/161 (33%)   85/161 (52%)
Peroxisomal membrane protein A (PMP 20) (Candida boidinii)              130360     59/170 (34%)   89/170 (51%)
Peroxisomal membrane protein B (PMP 20) (Candida boidinii)              130361     58/170 (34%)   88/170 (51%)
Putative peroxisomal protein PMP from yeast (Saccharomyces cerevisiae)  1709682    41/138 (29%)   72/138 (51%)
Alkylhydroperoxide reductase C22 protein (Escherichia coli)             P26427     36/126 (28%)   58/126 (45%)


Table 2

Name                                                          Accession No   Identity
Human mRNA down-regulated in cells infected by adenovirus 5   U82616         259/263 (98%)
Human mRNA down-regulated in cells infected by adenovirus 5   U82615         300/321 (93%)
In Table 2, an identity of 98% has been
obtained with the alignment of 259 nucleotides of the B18
cDNA, which comprises 805 nucleotides in total, with 263
nucleotides of the U82616 cDNA. A similar identity has been
obtained with the U82615 sequence.
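
The percentages in Table 2 follow directly from the reported ratio of identical positions to aligned positions. A minimal sketch of that arithmetic is given below. The function name is hypothetical, and it is an assumption that the table reports the value truncated to a whole number.

```python
def identity_percent(identical: int, aligned: int) -> float:
    """Identity as a percentage of aligned positions."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return 100.0 * identical / aligned


if __name__ == "__main__":
    # Ratios reported in Table 2 for U82616 and U82615.
    for accession, identical, aligned in [("U82616", 259, 263), ("U82615", 300, 321)]:
        print(accession, int(identity_percent(identical, aligned)))
```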

The sequence SEQ ID NO 1 comprising 805
nucleotides presents a homology with several EST sequences
obtained from a human and from a mouse, having the
following references:

H:
AA130751, N42215, W38597, N91311, N68467, AA187737,
N68916, W00593, R88950, AA181884, H20154, H66666

Mouse:
AA220019, AA123351, AA087129, AA255021, AA249897, W71344

Example 2: Tissue detection

A human RNA Master Blot (Clontech) containing
100-500 ng of poly-A+ human RNA in each dot (normalised to
the mRNA expression levels of eight different housekeeping
genes) was hybridised with a 554 bp-long B18 probe labelled
with 32P, and quantified using Phosphorimaging Technology.
As shown in Figure 1, B18 mRNA is present in all tissues
examined but predominantly in trachea, lung, kidney,
thyroid gland, stomach, colon, heart and some regions of
the brain. The highest expression has been noted in the
thyroid tissue. This presence is probably correlated with the
possible antioxidant activity of the B18 polypeptide
according to the invention.

Example 3: Inflammatory reaction

Figure 2 represents a Northern blot analysis
of rat lung mRNA 6, 48 and 72 hours after
lipopolysaccharide (LPS) instillation inducing an
inflammatory reaction in the lung.

A Northern blot containing 15 µg of total RNA
in each lane was hybridised with a 225 bp-long rat B18
probe, stripped and reprobed with a 572 bp-long rat β-actin
probe, both labelled with 32P. The Northern blot was
quantified using Phosphorimaging Technology and the B18
mRNA data were normalised to the β-actin mRNA level.
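
The normalisation step can be sketched as a per-lane ratio of the B18 signal to the β-actin signal, optionally expressed relative to a control lane. The function names and the signal values below are hypothetical and are not data from this document.

```python
def normalise_to_reference(target_signals, reference_signals):
    """Divide each lane's target signal by the same lane's reference signal."""
    if len(target_signals) != len(reference_signals):
        raise ValueError("one reference value is needed per lane")
    return [t / r for t, r in zip(target_signals, reference_signals)]


def fold_change(normalised, control_index=0):
    """Express normalised values relative to the chosen control lane."""
    control = normalised[control_index]
    return [value / control for value in normalised]


if __name__ == "__main__":
    # Hypothetical phosphorimager counts for a control lane and two treated lanes.
    b18 = [200.0, 300.0, 450.0]
    actin = [100.0, 100.0, 150.0]
    ratios = normalise_to_reference(b18, actin)
    print(ratios, fold_change(ratios))
```

Dividing by the reference signal corrects for unequal RNA loading between lanes before time points are compared.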

Example 4: Pneumotoxic reaction

Figure 3 represents a Northern blot analysis
of rat lung mRNA after intraperitoneal injection of
pneumotoxicants (4-ipomeanol, 1-(3-furyl)-4-hydroxypentanone
(IPO), methyl cyclopentadienyl manganese tricarbonyl (MMT)
or alpha-naphthylthiourea (ANTU)). These agents are known to
induce in the lung acute lesions of Clara (IPO) and
alveolar cells (MMT) as well as to increase the permeability
of the alveolar/blood barrier (ANTU). A Northern blot
containing 15 µg of total RNA in each lane was hybridised
with a 225 bp-long rat B18 probe, stripped and reprobed
with a 572 bp-long β-actin probe, both labelled with 32P.
The Northern blot was quantified using Phosphorimaging
Technology and the rat B18 mRNA data were normalised to the
β-actin mRNA level.

Example 5: Proteinic activity of the B18 polypeptide

An analysis of the complete human B18
amino acid sequence shows that said polypeptide presents
specific portions showing a homology with other antioxidant
enzymes (starting from a Leucine at position 36
until a Cysteine at position 47) and another portion
having an important homology with beta chains of ATP
synthase (starting from a Glutamic acid at position 13
until a Glycine at position 38).

Furthermore, the B18 amino acid sequence
according to the invention shows an important homology with
an Aspergillus fumigatus allergen (34% identity and 60%
homology by using Clustal V sequence alignment), especially
upon the portion of said B18 polypeptide having possible
antioxidant properties. Therefore, it is possible that a
peroxisomal protein (possibly homologous to B18 protein) is
able to induce and to bind IgE from patients sensitised to
Aspergillus fumigatus peroxisomal proteins after an
induction of the patient's immune system with Aspergillus
fumigatus allergen. This mechanism can be compared to a
reaction obtained with the manganese superoxide dismutase
(MnSOD) wherein the human MnSOD is able to bind to IgE from
patients sensitised to Aspergillus fumigatus MnSOD.

Furthermore, the Inventors have identified a
portion of the B18 human polypeptide which presents a
homology with a Cyclophilin-binding domain of Candida
boidinii PMP20 (receptor of the immunosuppressant drug
cyclosporin A). Said possible Cyclophilin-binding domain
starts from the Threonine at position 150 and extends until
the Leucine at position 161.

Example 6: B18 human gene and mRNA alternative splicing

As represented in the enclosed Figure 4, the
Inventors have identified upon the genomic DNA (SEQ ID NO
10) 5 exons and 5 introns. By RT-PCR (using primers
5'-gggtatgggactagctggcg-3' and 5'-ctggccaacattccaattgcag-3')
and according to the genomic sequence, 4 different cDNAs
corresponding to the transcription of the said genomic DNA
have been identified in human lung and in human brain. A
first cDNA of 736 bp corresponds to the cDNA encoding the
complete amino acid sequence of the B18 protein according
to the invention. However, 3 other cDNAs of 601, 604 and
469 bp were also identified, and comprise specific
splicings of one or more exons.
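
The 736 bp size of the full-length RT-PCR product agrees with the positions listed earlier in the document for these two primers (33-52 and 747-768). The sketch below shows the arithmetic, assuming those positions are inclusive coordinates on SEQ ID NO 1 and that the second primer anneals to the opposite strand. The function name is hypothetical.

```python
def amplicon_length(forward_start: int, reverse_end: int) -> int:
    """Length of a PCR product spanning inclusive positions forward_start..reverse_end."""
    if reverse_end < forward_start:
        raise ValueError("reverse primer must lie downstream of the forward primer")
    return reverse_end - forward_start + 1


if __name__ == "__main__":
    # Forward primer starts at position 33; reverse primer ends at position 768.
    print(amplicon_length(33, 768))
```

The shorter products of 601, 604 and 469 bp would then be 135, 132 and 267 bp shorter than the full-length product.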

Therefore, another aspect of the present
invention is related to said specific portions of the
complete genomic or cDNA nucleotide sequence according to
the invention or to specific portions of the complete amino
acid sequence of the B18 protein according to the
invention, which could also be used as specific markers of
the B18 activity, preferably the anti-oxidative activity.



Example 7: Knock-out mouse

Exons of a mouse genomic sequence encoding
the B18 polypeptide according to the invention have been
deleted by homologous recombination. Said homologous
recombination has been obtained with a genetic sequence
comprising a neomycin resistance gene. The targeting vector
with said gene and a thymidine kinase gene (in order to
eliminate non-homologous recombinants with ganciclovir) has
been prepared. Said recombination was used for the deletion
of one or more exons of the B18 polypeptide. After
electroporation of ES cells with the targeting vector,
positive clones having undergone homologous
recombination were identified by Southern blot with
labelled probes. Aggregation of said positive clones with a
morula from a Swiss pseudo-pregnant mouse produces several
chimeric mice which survive after birth. Several homozygous
mice are obtained by cross-breeding and are used as a model
for the above-mentioned diseases.



Similar experiments may be done with another
mammal whose B18 sequence is known (the B18 sequences of a
mouse and a rat and their alignment with the human sequence
are shown in the enclosed Figure 5).

Example 8: Chromosome localisation

Radiation hybrid clones (GeneBridge 4
Radiation Hybrid Panel, Research Genetics) were used for
performing chromosome localisation by PCR with two
different pairs of primers (5'-caggttcaccttgttccctggctc-3'
(SEQ ID NO 14), 5'-atgttatgcaaccctttgcgacac-3' (SEQ ID NO
17) and 5'-gtgtttgaaggggagccagggaac-3' (SEQ ID NO 18),
5'-agagacagggtttcaccatcttgg-3' (SEQ ID NO 19)).

The Inventors have located the B18 genomic
sequence on human chromosome 11q13. The B18 gene has been
located 7.15-6.1 cR from marker D11S913, between markers
D11S1963 and D11S4407 (Genome Database internet site).

Unknown genes linked to different disorders
have been localised in the same region of chromosome 11.
Therefore, the B18 gene is possibly associated with these
disorders:

- atopy (atopic hypersensitivity: asthma, hay fever and
eczema; MIM n°147050 at OMIM of NCBI internet site),
- high bone mass syndrome (MIM n°601884),
- osteopetrosis (MIM n°259700),
- osteoporosis-pseudoglioma syndrome (MIM n°259770) and
- Bardet-Biedl syndrome 1 (MIM n°209901).




CLAIMS 

1. Amino acid sequence having more than 70%
homology with the sequence SEQ ID NO 2.

2. Amino acid sequence according to claim 1,
having more than 85% homology with the sequence SEQ ID NO
2.

3. Amino acid sequence according to claim 1
or 2, having more than 95% homology with the sequence SEQ
ID NO 2.

4. Amino acid sequence according to any one
of the preceding claims, corresponding to SEQ ID NO 2 or an
immunoreactive portion thereof.

5. Nucleotide sequence encoding the amino
acid sequence according to any one of the preceding claims
and presenting more than 70% homology with SEQ ID NO 1 or
its complementary strand.

6. Nucleotide sequence according to claim 5,
having more than 85% homology with the sequence SEQ ID NO 1
or its complementary strand.

7. Nucleotide sequence according to claim 5,
having more than 95% homology with the sequence SEQ ID NO 1
or its complementary strand.

8. Nucleotide sequence according to any one
of the claims 5 to 7, corresponding to the sequence SEQ ID
NO 1, its complementary strand or a portion thereof
specific for SEQ ID NO 1 and comprising more than 15 base
pairs.

9. Vector comprising the nucleotide sequence
according to any one of the claims 5 to 8.

10. Inhibitor directed against the amino acid
or nucleotide sequence according to any one of the claims 1
to 8.
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11. Inhibitor according to claim i 0/ being an 
antibody, preferably a monoclonal antibody, or a portion of 
said antibody. 

12. Diagnostic device comprising an element 
5 selected from the group consisting of the amino acid 

sequence according to any one of the claims 1 to 4, the 
nucleotide sequence according to any one of the claims 5 to 
8, the inhibitor according to claim io or n, their 
portions or a mixture thereof. 

10 13 * Method for the ^ vitro detection of lung 

injuries and diseases or oxidative stress-related diseases 
and disorders, especially inflammatory diseases, comprising 
the steps of : 

- isolating a sample from a body fluid of a patient, 
15 preferably a human patient, 

- possibly inhibiting the contaminants present in said 
sample, 

• - put in contact said sample with an element selected from 
the group consisting of the amino acid sequence 
20 according to any one of the claims 1 to 4, the 
nucleotide sequence according to any one of the claims 5 
to 8, the inhibitor according to claim 10 or n, their 
portions or a mixture thereof, and 

- detecting a reaction of a molecule present in said 
25 sample with said element. 

14. Pharmaceutical composition comprising a
pharmaceutically acceptable carrier and an element selected
from the group consisting of the amino acid sequence
according to any one of the claims 1 to 4, the nucleotide
sequence according to any one of the claims 5 to 8, the
inhibitor according to claim 10 or 11, their portions or a
mixture thereof.

15. Use of the pharmaceutical composition
according to claim 14 for the manufacture of a medicament
for the prevention and/or the treatment of lung injuries or
diseases, and of oxidative stress-related diseases or
disorders, such as specific cardiovascular diseases like
arteriosclerosis, neurodegenerative disorders such as
Alzheimer's disease, Parkinson's disease, amyotrophic
lateral sclerosis, apoptosis and inflammatory reactions,
allergic reactions such as asthma, hay fever and eczema,
high bone mass syndrome, osteopetrosis,
osteoporosis-pseudoglioma syndrome, and Bardet-Biedl syndrome 1.

16. Cell transformed by the vector according
to claim 9 or comprising a partial or total deletion of its
nucleotide sequence according to any one of the claims 5 to
8.

17. Non-human animal, preferably a mammal,
transformed by the vector according to claim 9 or
comprising a partial or total deletion of its nucleotide
sequence according to any one of the claims 5 to 8.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 



(i) APPLICANT:
(A) NAME: UNIVERSITE CATHOLIQUE DE LOUVAIN
Halles Universitaires
(B) STREET: Place de l'Université, 1
(C) CITY: LOUVAIN-LA-NEUVE
(E) COUNTRY: BELGIUM
(F) POSTAL CODE (ZIP): B-1348

(A) NAME: UNIVERSITE DE MONS-HAINAUT
(B) STREET: Place du Parc 20
(C) CITY: MONS
(E) COUNTRY: BELGIUM
(F) POSTAL CODE (ZIP): B-7000

(ii) TITLE OF INVENTION: PEROXISOME-ASSOCIATED PEPTIDE, NUCLEOTIDE
SEQUENCE ENCODING SAID PEPTIDE AND THEIR USES IN THE
DIAGNOSTIC AND/OR THE TREATMENT OF LUNG INJURIES AND
DISEASES, AND OF OXIDATIVE STRESS-RELATED DISORDERS

(iii) NUMBER OF SEQUENCES: 19

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 805 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 193..681

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GCCAGGAGGC GGAGTGGAAG TGGCCGTGGG GCGGGTATGG GACTAGCTGG CGTGTGCGCC      60
CTGAGACGCT CAGCGGGCTA TATACTCGTC GGTGGGGCCG GCGGTCAGTC TGCGGCAGCG     120
GCAGCAAGAC GGTGCAGTGA AGGAGAGTGG GCGTCTGGCG GGGTCCGCAG TTTCAGCAGA     180

GCCGCTGCAG CC ATG GCC CCA ATC AAG GTG GGA GAT GCC ATC CCA GCA         228
              Met Ala Pro Ile Lys Val Gly Asp Ala Ile Pro Ala
              1               5                   10

GTG GAG GTG TTT GAA GGG GAG CCA GGG AAC AAG GTG AAC CTG GCA GAG       276
Val Glu Val Phe Glu Gly Glu Pro Gly Asn Lys Val Asn Leu Ala Glu
        15                  20                  25

CTG TTC AAG GGC AAG AAG GGT GTG CTG TTT GGA GTT CCT GGG GCC TTC       324
Leu Phe Lys Gly Lys Lys Gly Val Leu Phe Gly Val Pro Gly Ala Phe
    30                  35                  40

ACC CCT GGA TGT TCC AAG ACA CAC CTG CCA GGG TTT GTG GAG CAG GCT       372
Thr Pro Gly Cys Ser Lys Thr His Leu Pro Gly Phe Val Glu Gln Ala
45                  50                  55                  60

GAG GCT CTG AAG GCC AAG GGA GTC CAG GTG GTG GCC TGT CTG AGT GTT       420
Glu Ala Leu Lys Ala Lys Gly Val Gln Val Val Ala Cys Leu Ser Val
                65                  70                  75

AAT GAT GCC TTT GTG ACT GGC GAG TGG GGC CGA GCC CAC AAG GCG GAA       468
Asn Asp Ala Phe Val Thr Gly Glu Trp Gly Arg Ala His Lys Ala Glu
            80                  85                  90

GGC AAG GTT CGG CTC CTG GCT GAT CCC ACT GGG GCC TTT GGG AAG GAG       516
Gly Lys Val Arg Leu Leu Ala Asp Pro Thr Gly Ala Phe Gly Lys Glu
        95                  100                 105

ACA GAC TTA TTA CTA GAT GAT TCG CTG GTG TCC ATC TTT GGG AAT CGA       564
Thr Asp Leu Leu Leu Asp Asp Ser Leu Val Ser Ile Phe Gly Asn Arg
    110                 115                 120

CGT CTC AAG AGG TTC TCC ATG GTG GTA CAG GAT GGC ATA GTG AAG GCC       612
Arg Leu Lys Arg Phe Ser Met Val Val Gln Asp Gly Ile Val Lys Ala
125                 130                 135                 140

CTG AAT GTG GAA CCA GAT GGC ACA GGC CTC ACC TGC AGC CTG GCA CCC       660
Leu Asn Val Glu Pro Asp Gly Thr Gly Leu Thr Cys Ser Leu Ala Pro
                145                 150                 155

AAT ATC ATC TCA CAG CTC TGA GGCCCTGGGC CAGATTACTT CCTCCACCCC          711
Asn Ile Ile Ser Gln Leu  *
            160

TCCCTATCTC ACCTGCCCAG CCCTGTGCTG GGGCCCTGCA ATTGGAATGT TGGCCAGATT     771
TCTGCAATAA ACACTTGTGG TTTGCGGAAA AAAA                                 805
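
As an aid to reading the CDS feature above, the following sketch shows how the annotated coordinates and the first codons can be checked against the translation. It is an illustration only: the function names are hypothetical, the standard genetic code is assumed, and only the first twelve codons of the coding region are used.

```python
# Illustrative sketch; names are hypothetical and the standard genetic
# code is assumed (built from the conventional T, C, A, G ordering).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def cds_codon_count(start: int, end: int) -> int:
    """Number of codons in a CDS given 1-based inclusive coordinates."""
    length = end - start + 1
    if length % 3:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First twelve codons of the CDS of SEQ ID NO 1 (positions 193 to 228).
FIRST_CODONS = "ATG GCC CCA ATC AAG GTG GGA GAT GCC ATC CCA GCA"

if __name__ == "__main__":
    print(cds_codon_count(193, 681))
    print(translate(FIRST_CODONS))
```

The coordinates 193..681 span 489 bases, that is 163 codons. One possible reading, not stated in the listing itself, is that this corresponds to 162 residues plus the stop codon, which would explain the length of 163 given for SEQ ID NO 2.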



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 163 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ala Pro Ile Lys Val Gly Asp Ala Ile Pro Ala Val Glu Val Phe
1               5                   10                  15

Glu Gly Glu Pro Gly Asn Lys Val Asn Leu Ala Glu Leu Phe Lys Gly
            20                  25                  30

Lys Lys Gly Val Leu Phe Gly Val Pro Gly Ala Phe Thr Pro Gly Cys
        35                  40                  45

Ser Lys Thr His Leu Pro Gly Phe Val Glu Gln Ala Glu Ala Leu Lys
    50                  55                  60

Ala Lys Gly Val Gln Val Val Ala Cys Leu Ser Val Asn Asp Ala Phe
65                  70                  75                  80

Val Thr Gly Glu Trp Gly Arg Ala His Lys Ala Glu Gly Lys Val Arg
                85                  90                  95

Leu Leu Ala Asp Pro Thr Gly Ala Phe Gly Lys Glu Thr Asp Leu Leu
            100                 105                 110

Leu Asp Asp Ser Leu Val Ser Ile Phe Gly Asn Arg Arg Leu Lys Arg
        115                 120                 125

Phe Ser Met Val Val Gln Asp Gly Ile Val Lys Ala Leu Asn Val Glu
    130                 135                 140

Pro Asp Gly Thr Gly Leu Thr Cys Ser Leu Ala Pro Asn Ile Ile Ser
145                 150                 155                 160

Gln Leu  *



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 780 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Rattus rattus
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 136..624

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

TGCGTCCTAG GCAGCATAGC CGGATCGGTG CTCCGTGCAT CGGCTACTTG GACGTGCGTG      60
GCAGGCAGAG CAGGCCGGAA AGGAGCAGGT TGGGAGTGTG GTGGGGCCCG CAGCTTCAGC     120
AGTGCCGCGG TGACTATGGC CCCGATCAAG GTGGGAGACA CCATTCCCTC AGTGGAGGTA     180
TTTGRAGGGG AACCTGGAAA GAAGGTGAAC TTGGCAGAGC TGTTCAAGGA CAAGAAAGGT     240
GTTTTGTTTG GAGTCCCTGG GGCATTTACA CCTGGCTGTT CCAAGACCCA TCTGCCTGGG     300
TTTGTGGAGC AAGCCGGAGC TCYGAAGGCC AAGGGAGCAC AAGTGGTGGC CTGTCTGAGT     360
GTTAATGATG YCTTCGTGAC TGCAGAGTGG GGTCGAGCCC ACCAGGCAGA AGGCAAGGTT     420
CAGCTCCTGG CTGACCCCAC TGGAGCTTTT GGAAAGGAGA CAGATTTACT ACTAGATGAT     480
TCTTTGGTGT CTCTCTTTGG GAATCGTCGG CTAAAAAGGT TCTCCATGGT GATAGACAAG     540
GGCGTAGTAA AGGCACTGAA CGTGGAGCCG GATGGCACAG GCCTCACCTG CAGCCTGGCC     600
CCCAACATCC TCTCACAACT CTGAGGCCCT GACCAGAATG TCCTCTGACT CTCCCATCTC     660
CTCCACCCAG CTCTGGGCCA AAGGCCCAGT ACCTCCTTAC CTGAGGGCCA CTGGAATGGA     720
ACCTTGACAA TATTTCTGCA ATAAACAGTT TAATTTGTGA AAAAAAAAAA AAAAAAAAAA     780



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 162 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Rattus rattus
(ix) FEATURE:
(A) NAME/KEY: Modified-site
(B) LOCATION: 17
(D) OTHER INFORMATION: /product= "Glu or Gly"
(ix) FEATURE:
(A) NAME/KEY: Modified-site
(B) LOCATION: 63
(D) OTHER INFORMATION: /product= "Leu or Pro"
(ix) FEATURE:
(A) NAME/KEY: Modified-site
(B) LOCATION: 79
(D) OTHER INFORMATION: /product= "Ala or Val"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Ala Pro Ile Lys Val Gly Asp Thr Ile Pro Ser Val Glu Val Phe
1               5                   10                  15

Xaa Gly Glu Pro Gly Lys Lys Val Asn Leu Ala Glu Leu Phe Lys Asp
            20                  25                  30

Lys Lys Gly Val Leu Phe Gly Val Pro Gly Ala Phe Thr Pro Gly Cys
        35                  40                  45

Ser Lys Thr His Leu Pro Gly Phe Val Glu Gln Ala Gly Ala Xaa Lys
    50                  55                  60

Ala Lys Gly Ala Gln Val Val Ala Cys Leu Ser Val Asn Asp Xaa Phe
65                  70                  75                  80

Val Thr Ala Glu Trp Gly Arg Ala His Gln Ala Glu Gly Lys Val Gln
                85                  90                  95

Leu Leu Ala Asp Pro Thr Gly Ala Phe Gly Lys Glu Thr Asp Leu Leu
            100                 105                 110

Leu Asp Asp Ser Leu Val Ser Leu Phe Gly Asn Arg Arg Leu Lys Arg
        115                 120                 125

Phe Ser Met Val Ile Asp Lys Gly Val Val Lys Ala Leu Asn Val Glu
    130                 135                 140

Pro Asp Gly Thr Gly Leu Thr Cys Ser Leu Ala Pro Asn Ile Leu Ser
145                 150                 155                 160

Gln Leu



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 675 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mouse 

(ix) FEATURE: 

(A) NAME /KEY: CDS 

(B) LOCATION: 99..588

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

TGCTCCGTGC ATCGACGTGC TTGGCAGGCA GAGCAGGCCG GAAAGAAGCA GGTTGGGAGT      60
GTGGCGGAGC CCGCAGCTTC AGCAGCTCCG CGGTGACCAT GGCCCCGATC AAGGTGGGAG     120
ATGCCATTCC CTCAGTGGAG GTATTTGAAG GGGAACCGGG AAAGAAGGTG AACTTGGCAG     180
AGCTGTTCAA GGGCAAGAAA GGTGTTTTGT TTGGAGTCCC TGGGGCATTT ACACCTGGCT     240
GTTCTAAGAC CCACCTGCCT GGGTTTGTGG AGCAAGCTGG AGCTCTGAAG GCTAAGGGAG     300
CGCAGGTGGT GGCCTGTCTG AGCGTTAATG ACGTCTTTGT GATTGAAGAG TGGGGTCGAG     360
CCCACCAGGC AGAAGGCAAG GTTCGGCTCC TGGCTGACCC CACTGGAGCC TTTGGGAAGG     420
CGACAGACTT ATTATTGGAT GATTCTTTGG TGTCTCTCTT TGGGAATCGT CGGCTGAAAA     480
GGTTCTCCAT GGTGATAGAG AACGGCATAG TGAAGGCACT GAACGTGGAG CCAGATGGCA     540
CAGGCCTCAC CTGCAGCCTG GCCCCCAACA TCCTCTCCCA ACTCTGAGGC CCTGGCCAGA     600
TGTCCTCTGA CTCTCCCATC TCTCCCACCC GGCTCTAGGC CAAAAGGCTC GGTACCTCCT     660
TACTGGGAGC CACGT                                                      675

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 162 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Mouse

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Ala Pro Ile Lys Val Gly Asp Ala Ile Pro Ser Val Glu Val Phe
1               5                   10                  15

Glu Gly Glu Pro Gly Lys Lys Val Asn Leu Ala Glu Leu Phe Lys Gly
            20                  25                  30

Lys Lys Gly Val Leu Phe Gly Val Pro Gly Ala Phe Thr Pro Gly Cys
        35                  40                  45

Ser Lys Thr His Leu Pro Gly Phe Val Glu Gln Ala Gly Ala Leu Lys
    50                  55                  60

Ala Lys Gly Ala Gln Val Val Ala Cys Leu Ser Val Asn Asp Val Phe
65                  70                  75                  80

Val Ile Glu Glu Trp Gly Arg Ala His Gln Ala Glu Gly Lys Val Arg
                85                  90                  95

Leu Leu Ala Asp Pro Thr Gly Ala Phe Gly Lys Ala Thr Asp Leu Leu
            100                 105                 110

Leu Asp Asp Ser Leu Val Ser Leu Phe Gly Asn Arg Arg Leu Lys Arg
        115                 120                 125

Phe Ser Met Val Ile Asp Asn Gly Ile Val Lys Ala Leu Asn Val Glu
    130                 135                 140

Pro Asp Gly Thr Gly Leu Thr Cys Ser Leu Ala Pro Asn Ile Leu Ser
145                 150                 155                 160

Gln Leu



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 469 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: CDS
(B) LOCATION: 161..382

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GGGTATGGGA CTAGCTGGCG TGTGCGCCCT GAGACGCTCA GCGGGCTATA TACTCGTCGG      60
TGGGGCCGGC GGTCAGTCTG CGGCAGCGGC AGCAAGACGG TGCAGTGAAG GAGAGTGGGC     120
GTCTGGCGGG GTCCGCAGTT TCAGCAGAGC CGCTGCAGCC ATGGCCCCAA TCAAGGTTCG     180
GCTCCTGGCT GATCCCACTG GGGCCTTTGG GAAGGAGACA GACTTATTAC TAGATGATTC     240
GCTGGTGTCC ATCTTTGGGA ATCGACGTCT CAAGAGGTTC TCCATGGTGG TACAGGATGG     300
CATAGTGAAG GCCCTGAATG TGGAACCAGA TGGCACAGGC CTCACCTGCA GCCTGGCACC     360
CAATATCATC TCACAGCTCT GAGGCCCTGG GCCAGATTAC TTCCTCCACC CCTCCCTATC     420
TCACCTGCCC AGCCGTGTGC TGGGGCCCTG CAATTGGAAT GTTGGCCAG                 469

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 601 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 161..514

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GGGTATGGGA CTAGCTGGCG TGTGCGCCCT GAGACGCTCA GCGGGCTATA TACTCGTCGG      60
TGGGGCCGGC GGTCAGTCTG CGGCAGCGGC AGCAAGACGG TGCAGTGAAG GAGAGTGGGC     120
GTCTGGCGGG GTCCGCAGTT TCAGCAGAGC CGCTGCAGCC ATGGCCCCAA TCAAGACACA     180
CCTGCCAGGG TTTGTGGAGC AGGCTGAGGC TCTGAAGGCC AAGGGAGTCC AGGTGGTGGC     240
CTGTCTGAGT GTTAATGATG CCTTTGTGAC TGGCGAGTGG GGCCGAGCCC ACAAGGCGGA     300
AGGCAAGGTT CGGCTCCTGG CTGATCCCAC TGGGGCCTTT GGGAAGGAGA CAGACTTATT     360
ACTAGATGAT TCGCTGGTGT CCATCTTTGG GAATCGACGT CTCAAGAGGT TCTCCATGGT     420
GGTACAGGAT GGCATAGTGA AGGCCCTGAA TGTGGAACCA GATGGCACAG GCCTCACCTG     480
CAGCCTGGCA CCCAATATCA TCTCACAGCT CTGAGGCCCT GGGCCAGATT ACTTCCTCCA     540
CCCCTCCCTA TCTCACCTGC CCAGCCCTGT GCTGGGGCCC TGCAATTGGA ATGTTGGCCA     600
G                                                                     601

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 604 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 161..517

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GGGTATGGGA CTAGCTGGCG TGTGCGCCCT GAGACGCTCA GCGGGCTATA TACTCGTCGG      60
TGGGGCCGGC GGTCAGTCTG CGGCAGCGGC AGCAAGACGG TGCAGTGAAG GAGAGTGGGC     120
GTCTGGCGGG GTCCGCAGTT TCAGCAGAGC CGCTGCAGCC ATGGCCCCAA TCAAGGTGGG     180
AGATGCCATC CCAGCAGTGG AGGTGTTTGA AGGGGAGCCA GGGAACAAGG TGAACCTGGC     240
AGAGCTGTTC AAGGGCAAGA AGGGTGTGCT GTTTGGAGTT CCTGGGGCCT TCACCCCTGG     300
ATGTTCCAAG GTTCGGCTCC TGGCTGATCC CACTGGGGCC TTTGGGAAGG AGACAGACTT     360
ATTACTAGAT GATTCGCTGG TGTCCATCTT TGGGAATCGA CGTCTCAAGA GGTTCTCCAT     420
GGTGGTACAG GATGGCATAG TGAAGGCCCT GAATGTGGAA CCAGATGGCA CAGGCCTCAC     480
CTGCAGCCTG GCACCCAATA TCATCTCACA GCTCTGAGGC CCTGGGCCAG ATTACTTCCT     540
CCACCCCTCC CTATCTCACC TGCCCAGCCC TGTGCTGGGG CCCTGCAATT GGAATGTTGG     600
CCAG                                                                  604

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2710 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 2516..2710
(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 2074..2135
(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1932..1970
(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1728..1859
(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 802..936
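
The exon coordinates listed above can be turned into exon and intervening-sequence lengths with a few lines of arithmetic. The sketch below is illustrative only; the function names are hypothetical, and the coordinates are taken as 1-based and inclusive, as is usual for this listing format.

```python
# Exon coordinates as listed in the FEATURE entries of SEQ ID NO 10
# (assumed 1-based and inclusive).
EXONS = [(802, 936), (1728, 1859), (1932, 1970), (2074, 2135), (2516, 2710)]


def exon_lengths(exons):
    """Length of each exon, in 5' to 3' order."""
    return [end - start + 1 for start, end in sorted(exons)]


def intron_lengths(exons):
    """Length of the sequence lying between consecutive exons."""
    ordered = sorted(exons)
    return [nxt[0] - prev[1] - 1 for prev, nxt in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    print(exon_lengths(EXONS))
    print(intron_lengths(EXONS))
```

The first two listed exons are 135 and 132 bases long. As an arithmetic observation only, 469 + 132 = 601 and 469 + 135 = 604, which matches the stated lengths of SEQ ID NO 7, 8 and 9 if those cDNAs differ by these two exons; that reading is an inference from the numbers, not a statement made in the listing.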



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TCTGTCCCTT AGCGCCCCCG CGGGGGCTTA CCCCATCCCA CTCCATGACC TCCCCTCCCC      60
CCATGGCGAA TTCCCACCTT TCTGTCTTTC ACTCACTTCC TGGAACCGTC CCCAGGGCCT     120
TGGACCTTCC CCCTTCTCCT CCCAAACCTT GTGAGACCCC ATTCCCTTTC TACTTCATCC     180
TGCTCTCAAC TTTTGGGCTC CTCAGAGGCC CTCACCCCTG ACTCTCTCTC CCTACCCACT     240
CTGGTCCCAT GAAGCCCTCA AGTACTCTGG GGATGGATCC TTCCCCCTTC AAAAGATTCC     300
TTCTTTTGTT CTACACCTCC TGGGTGTAGG GGCCTGGACA CCCTCCCCCA ACGTTCCACC     360
TGCCGCTGCC CTTCCTCTTC CTCCTCCTGA GGGTGGGACC CTCAGACCTG GCCAAGATCC     420
TCTCCCTCCA TGTTGTCAGG GACTCCTCCT CACCCCCAAA TACAGCCCTC TAGCCCCTGT     480
CCATTTTATT CCACTCCTTT CCTGTAACCT AGACAGCATG TTATGCAACC CTTTGCGACA     540
CATGGGGAAA CCTTCCCTCC CTTCCTCTGT TGTCACCAAT GGCCCCTTAA GAGGAGCAGG     600
GCCACCTTGA AACTTGGAGG ATATGGGGTA ACCCAGTGGfe AGCGGGCAGG GAGGGCCCTT    660
GGAAACTGAC AGGGCTGGAG TATCCTGCTG GGTTTCAGCC CCGGTTCCTG CAGGCACAGC     720
TGCCAGGCTC TCTGTTCACC TTCCTGCCTC TGGTTTGCCC CGGCTCCCTC ACCCCCCTTA     780
CCCTGGAGTC CTTCCTTCTA GGTGGGAGAT GCCATCCCAG CAGTGGAGGT GTTTGAAGGG     840
GAGCCAGGGA ACAAGGTGAA CCTGGCAGAG CTGTTCAAGG GCAAGAAGGG TGTGCTGTTT     900
GGAGTTCCTG GGGCCTTCAC CCCTGGATGT TCCAAGGTGA GGCCCTTCCC CTTCTGAAGA     960
TCAGGACCTG GGGATCTTTT GTGTTGCTCT TAAGTCCTCC ACATAGTCCT GATAGGACTC    1020
CTAAAAAGCA TTTCAGTGCC ATCACAAAAC AAGTAGAGCT GGGTAGAGCT GGGCGCGGTG    1080
GCTCACGCCT GTAATCCCAG CACTTTGGGA GGCCAAGGCG GGTGGATCAC GAGGTCAGGA    1140
GTCCAAAACC AGCCTGGCCA AGATGGTGAA ACCCTGTCTC TACTAAAAAT GCAAAAAAAT    1200
CAGCCGGATA TGGTGGCGGG CGCCTGTAAT CCCAGGTATT GGGGAGGCTG AGGCAGAGAA    1260
TTGCTTGAAC CCAGGAGGCG TAGGTTGCAG TGAGTGGAGA TCGTGCCTCT GCAGTCCAGC    1320
CTGGGTGAAA GAGCGAGACT CCGTCTCAAA ATGAAAAAAA AAAAAGAAAA CAAGTAGAGA    1380
CTGCAAAAAG GGAACAGTAC CGGGAATGTT GGAGAAAAAC ATACTACAAT TAAATCCAAC    1440
ACCCCTGTTG GTCCTGCTAA ATGACAGGCA CTGTGGAAGG TGCTTGGGAC TCAGATAAAT    1500
AAGACAAAGA TCTGCCCATG GAAAGTTCAC GTCTGGACCA TAAGGCATTA GGTTTCATTC    1560
TGAGCTTCCT AGTGGCCAAG GCAAAAAGGA AATAGAATGG TTTAGACAGC TCTCATTGTC    1620
TGATCAAAGG TGTTGAGGCA GAGCACTGAG GAGGGCCTGG AGATAAAGGG TGGGCTGGGG    1680
GTCAGATGCA GTTATCCCTT TGCCGACCCT TTGTTCCCCT TCCTCAGACA CACCTGCCAG    1740
GGTTTGTGGA GCAGGCTGAG GCTCTGAAGG CCAAGGGAGT CCAGGTGGTG GCCTGTCTGA    1800
GTGTTAATGA TGCCTTTGTG ACTGGCGAGT GGGGCCGAGC CCACAAGGCG GAAGGCAAGG    1860
TGAGGTGAGG GGCCTGCAGG GAGTCAGGAC CAGGTAGGAT ATTCTTCTTG TGACCTCTAC    1920
TTTCTCTGCA GGTTCGGCTC CTGGCTGATC CCACTGGGGC CTTTGGGAAG GTGAGTGTTC    1980
CCCTGACCGC CACAGGGACA TGGCGGTGCG GGGAGCAGTG GGGGCCCTTG GCCTCTTCAA    2040
GGATTTCTGA CACTTTTCTC TGTCTCTTCT TAGGAGACAG ACTTATTACT AGATGATTCG    2100
CTGGTGTCCA TCTTTGGGAA TCGACGTCTC AAGAGGTAAA AGTGGAGAGT CCTCTGTGGA    2160
GAAAGTCCTC TGTGGGAGAG AGTCCTCTGT GGGAGAGAGT CCTCTGTGGA GAGGGTCCTC    2220
TGTGGGAAGA GTCGTCTGTG GGGGAGATGT GTGGGAGAGA GTCCTGTGTG GGGAGAGTCT    2280
TCTGTAGGGG AGAGTCCTCT GGGGAGAGAG TCCTGTGTGG GGGAGAGTCC TCTGTGGGGA    2340
GAGTCCTCTG TGTGGAGAGA GTCCTGTGTG GTGGTGAGTC CTCTGTGGGG GAGAGTCCTC    2400
TGTGGGGGGA GTCCTCTCTG GAGTTCTCTT GGGCCCCTGG CTGTTCACTG CCTGTCTCCA    2460
TGCCCAGCCT CCAAGCCCAG GCTGATGCAG CTGGCTGGGC CCCTCTTTCC GGCAGGTTCT    2520
CCATGGTGGT ACAGGATGGC ATAGTGAAGG CCCTGAATGT GGAACCAGAT GGCACAGGCC    2580
TCACCTGCAG CCTGGCACCC AATATCATCT CACAGCTCTG AGGCCCTGGG CCAGATTACT    2640
TCCTCCACCC CTCCCTATCT CACCTGCCCA GCCCTGTGCT GGGGCCCTGC AATTGGAATG    2700
TTGGCCAGAT                                                           2710

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GCCATCCCAG CAGTGGAGGT GTTTG 25 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
TTGAACAGCT CTGCCAGGTT CACC
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
TGGAGGTGTT TGAAGGGGAG CCAG 
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
CAGGTTCACC TTGTTCCCTG GCTC 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
GGGTATGGGA CTAGCTGGCG 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
CTGGCCAACA TTCCAATTGC AG 22



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
ATGTTATGCA ACCCTTTGCG ACAC 24
(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
GTGTTTGAAG GGGAGCCAGG GAAC 24 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
AGAGACAGGG TTTCACCATC TTGG 24

FIG. 5a

CLUSTAL V alignment of human and rat B18 amino acid sequences (Identity:
90%, Homology: 97.5%):

B18hum   MAPIKVGDAIPAVEVFEGEPGNKVNLAELFKGKKGVLFGVPGAFTPGCSK = SEQ ID NO 1
B18rat   MAPIKVGDTIPSVEVFEGEPGKKVNLAELFKDKKGVLFGVPGAFTPGCSK

B18hum   THLPGFVEQAEALKAKGVQVVACLSVNDAFVTGEWGRAHKAEGKVRLLAD
B18rat   THLPGFVEQAGALKAKGAQVVACLSVNDVFVTAEWGRAHQAEGKVQLLAD

B18hum   PTGAFGKETDLLLDDSLVSIFGNRRLKRFSMVVQDGIVKALNVEPDGTGL
B18rat   PTGAFGKETDLLLDDSLVSLFGNRRLKRFSMVIDKGVVKALNVEPDGTGL

B18hum   TCSLAPNIISQL
B18rat   TCSLAPNILSQL



CLUSTAL V alignment of human and mouse B18 amino acid sequences (Identity:
91%, Homology: 96%):

B18hum   MAPIKVGDAIPAVEVFEGEPGNKVNLAELFKGKKGVLFGVPGAFTPGCSK
B18mouse MAPIKVGDAIPSVEVFEGEPGKKVNLAELFKGKKGVLFGVPGAFTPGCSK

B18hum   THLPGFVEQAEALKAKGVQVVACLSVNDAFVTGEWGRAHKAEGKVRLLAD
B18mouse THLPGFVEQAGALKAKGAQVVACLSVNDVFVIEEWGRAHQAEGKVRLLAD

B18hum   PTGAFGKETDLLLDDSLVSIFGNRRLKRFSMVVQDGIVKALNVEPDGTGL
B18mouse PTGAFGKATDLLLDDSLVSLFGNRRLKRFSMVIDNGIVKALNVEPDGTGL

B18hum   TCSLAPNIISQL
B18mouse TCSLAPNILSQL
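
The identity figure quoted for the human and mouse proteins can be reproduced with a simple position-by-position comparison, since both sequences have the same length and the alignment contains no gaps. The sketch below is an illustration under that assumption; the function name is hypothetical, the one-letter sequences are transcribed from SEQ ID NO 2 and SEQ ID NO 6, and "homology" (which also counts conservative substitutions) is not computed.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions between two pre-aligned sequences.

    Columns where both sequences carry a gap ('-') are ignored; a gap
    facing a residue counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no comparable positions")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# One-letter transcriptions of SEQ ID NO 2 (human) and SEQ ID NO 6 (mouse).
HUMAN = (
    "MAPIKVGDAIPAVEVFEGEPGNKVNLAELFKGKKGVLFGVPGAFTPGCSK"
    "THLPGFVEQAEALKAKGVQVVACLSVNDAFVTGEWGRAHKAEGKVRLLAD"
    "PTGAFGKETDLLLDDSLVSIFGNRRLKRFSMVVQDGIVKALNVEPDGTGL"
    "TCSLAPNIISQL"
)
MOUSE = (
    "MAPIKVGDAIPSVEVFEGEPGKKVNLAELFKGKKGVLFGVPGAFTPGCSK"
    "THLPGFVEQAGALKAKGAQVVACLSVNDVFVIEEWGRAHQAEGKVRLLAD"
    "PTGAFGKATDLLLDDSLVSLFGNRRLKRFSMVIDNGIVKALNVEPDGTGL"
    "TCSLAPNILSQL"
)

if __name__ == "__main__":
    print(round(percent_identity(HUMAN, MOUSE), 1))
```

With these transcriptions the two proteins differ at 14 of 162 positions, which rounds to the 91% identity stated in the figure.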



FIG. 5b

CLUSTAL V alignment of human and rat cDNA sequences (identity: 612/780,
78.5%):

B18hum   GCCAGGAGGCGGAGTGGAAGTGGCCGTGGGGCGGGTATGGGACTAGCTGG
B18rat   TG CGTC CTAGGCAG

B18hum   CGTGTGCGCCCTGAGACGCTCAGCGGGCTATATACTCGTCGGTGGGGCCG
B18rat   CATA GCC GGA TCGGTGCTCCGTGCATCGGCTACTTGGAC

B18hum   GCGGTCAGTCTGCGGCAGCGGCAGCAAGACGGTGCAGTGAAGGAGAGTGG
B18rat   GTGCGTGGCAGGCAGAGCAGGCCGG AAAGGAGCAGGTTGG

B18hum   GCGTCTGGCGGGGTCCGCAGTTTCAGCAGAGCCGCTGCAGCCATGGCCCC
B18rat   GAGTGTGGTGGGGCCCGCAGCTTCAGCAGTGCCGCGGTGACTATGGCCCC

B18hum   AATCAAGGTGGGAGATGCCATCCCAGCAGTGGAGGTGTTTGAAGGGGAGC
B18rat   GATCAAGGTGGGAGACACCATTCCCTCAGTGGAGGTATTTGAAGGGGAAC

B18hum   CAGGGAACAAGGTGAACCTGGCAGAGCTGTTCAAGGGCAAGAAGGGTGTG
B18rat   CTGGAAAGAAGGTGAACTTGGCAGAGCTGTTCAAGGACAAGAAAGGTGTT

B18hum   CTGTTTGGAGTTCCTGGGGCCTTCACCCCTGGATGTTCCAAGACACACCT
B18rat   TTGTTTGGAGTCCCTGGGGCATTTACACCTGGCTGTTCCAAGACCCATCT

B18hum   GCCAGGGTTTGTGGAGCAGGCTGAGGCTCTGAAGGCCAAGGGAGTCCAGG
B18rat   GCCTGGGTTTGTGGAGCAAGCCGGAGCTCTGAAGGCCAAGGGAGCACAAG

B18hum   TGGTGGCCTGTCTGAGTGTTAATGATGCCTTTGTGACTGGCGAGTGGGGC
B18rat   TGGTGGCCTGTCTGAGTGTTAATGATGTCTTCGTGACTGCAGAGTGGGGT

B18hum   CGAGCCCACAAGGCGGAAGGCAAGGTTCGGCTCCTGGCTGATCCCACTGG
B18rat   CGAGCCCACCAGGCAGAAGGCAAGGTTCAGCTCCTGGCTGACCCCACTGG

B18hum   GGCCTTTGGGAAGGAGACAGACTTATTACTAGATGATTCGCTGGTGTCCA
B18rat   AGCTTTTGGAAAGGAGACAGATTTACTACTAGATGATTCTTTGGTGTCTC

B18hum   TCTTTGGGAATCGACGTCTCAAGAGGTTCTCCATGGTGGTACAGGATGGC
B18rat   TCTTTGGGAATCGTCGGCTAAAAAGGTTCTCCATGGTGATAGACAAGGGC

B18hum   ATAGTGAAGGCCCTGAATGTGGAACCAGATGGCACAGGCCTCACCTGCAG
B18rat   GTAGTAAAGGCACTGAACGTGGAGCCGGATGGCACAGGCCTCACCTGCAG

B18hum   CCTGGCACCCAATATCATCTCACAGCTCTGAGGCCCTGGGCCAGATTACT
B18rat   CCTGGCCCCCAACATCCTCTCACAACTCTGAGGCCCTGA-CCAGA--ATG

B18hum   TCCTCCACCCCTCCCTATCTCACCTGCCCAGCCCTGTGCTGG-GGCCCTG
B18rat   TCCTCTGACTCTCCC-ATCTCCTCCACCCAGCTCTGGGCCAAAGGCCCAG

B18hum   CA ATTGGAATG TTGGCCAGATTTCTGC
B18rat   TACCTCCTTACCTGAGGGCCACTGGAATGGAACCTTGACAATATTTCTGC

B18hum   AATAAACACTTGTGGTTTGCGGAAAAAAA
B18rat   AATAAACAGTT-TAATTTGTGAAAAAAAAAAAAAAAAAAAAA

FIG. 5c

CLUSTAL V alignment of human and mouse cDNA sequences (Identity: 552/675,
81.8%):

B18hum   GCCAGGAGGCGGAGTGGAAGTGGCCGTGGGGCGGGTATGGGACTAGCTGG
B18mouse

B18hum   CGTGTGCGCCCTGAGACGCTCAGCGGGCTATATACTCGTCGGTGGGGCCG
B18mouse TGCTCCGTG CAT C G AC GT G CTT G

B18hum   GCGGTCAGTCTGCGGCAGCGGCAGCAAGACGGTGCAGTGAAGGAGAGTGG
B18mouse GCAGGCAG AGCAGGCCGG AAAGAAGCAGGTTGG

B18hum   GCGTCTGGCGGGGTCCGCAGTTTCAGCAGAGCCGCTGCAGCCATGGCCCC
B18mouse GAGTGTGGCGGAGCCCGCAGCTTCAGCAGCTCCGCGGTGACCATGGCCCC

B18hum   AATCAAGGTGGGAGATGCCATCCCAGCAGTGGAGGTGTTTGAAGGGGAGC
B18mouse GATCAAGGTGGGAGATGCCATTCCCTCAGTGGAGGTATTTGAAGGGGAAC

B18hum   CAGGGAACAAGGTGAACCTGGCAGAGCTGTTCAAGGGCAAGAAGGGTGTG
B18mouse CGGGAAAGAAGGTGAACTTGGCAGAGCTGTTCAAGGGCAAGAAAGGTGTT

B18hum   CTGTTTGGAGTTCCTGGGGCCTTCACCCCTGGATGTTCCAAGACACACCT
B18mouse TTGTTTGGAGTCCCTGGGGCATTTACACCTGGCTGTTCTAAGACCCACCT

B18hum   GCCAGGGTTTGTGGAGCAGGCTGAGGCTCTGAAGGCCAAGGGAGTCCAGG
B18mouse GCCTGGGTTTGTGGAGCAAGCTGGAGCTCTGAAGGCTAAGGGAGCGCAGG

B18hum   TGGTGGCCTGTCTGAGTGTTAATGATGCCTTTGTGACTGGCGAGTGGGGC
B18mouse TGGTGGCCTGTCTGAGCGTTAATGACGTCTTTGTGATTGAAGAGTGGGGT

B18hum   CGAGCCCACAAGGCGGAAGGCAAGGTTCGGCTCCTGGCTGATCCCACTGG
B18mouse CGAGCCCACCAGGCAGAAGGCAAGGTTCGGCTCCTGGCTGACCCCACTGG

B18hum   GGCCTTTGGGAAGGAGACAGACTTATTACTAGATGATTCGCTGGTGTCCA
B18mouse AGCCTTTGGGAAGGCGACAGACTTATTATTGGATGATTCTTTGGTGTCTC

B18hum   TCTTTGGGAATCGACGTCTCAAGAGGTTCTCCATGGTGGTACAGGATGGC
B18mouse TCTTTGGGAATCGTCGGCTGAAAAGGTTCTCCATGGTGATAGACAACGGC

B18hum   ATAGTGAAGGCCCTGAATGTGGAACCAGATGGCACAGGCCTCACCTGCAG
B18mouse ATAGTGAAGGCACTGAACGTGGAGCCAGATGGCACAGGCCTCACCTGCAG

B18hum   CCTGGCACCCAATATCATCTCACAGCTCTGAGGCCCTGGGCCAGATTACT
B18mouse CCTGGCCCCCAACATCCTCTCCCAACTCTGAGGCCCTGG-CCAGATG

B18hum   TCCTCCACCCCTCCCTATCTCACCTGCCCAGCCCTGTGCTGGGGCCCTGC
B18mouse TCCTCTGACTCTCCC-ATCTCTCCCACCCGGCTCT AGGCC
PEROXISOME-ASSOCIATED POLYPEPTIDE, NUCLEOTIDE SEQUENCE
ENCODING SAID POLYPEPTIDE AND THEIR USES IN THE DIAGNOSIS
AND/OR THE TREATMENT OF LUNG INJURIES AND DISEASES, AND OF
OXIDATIVE STRESS-RELATED DISORDERS

Field of the invention

The present invention is related to a new
peroxisome-associated polypeptide, the nucleotide sequence
encoding said polypeptide and portions thereof as well as
their uses in the diagnosis of several diseases, especially
the diagnosis and/or the treatment of lung injuries and
diseases, and of oxidative stress-related disorders.

Background of the invention

Peroxisomes are organelles nearly
ubiquitous in eukaryotic cells. They contain enzymes
essential for various catabolic and anabolic pathways. Some
of these enzymes are expressed constitutively while others
can be induced under appropriate conditions. Peroxisomes
carry out a variety of essential reactions such as
peroxisomal oxidation and respiration, fatty acid
beta-oxidation, cholesterol and dolichol metabolism,
ether-phospholipid synthesis, and glyoxylate and pipecolic
acid metabolism.

The peroxisomal respiratory pathway is based
upon the formation of hydrogen peroxide by a collection of
oxidases and the decomposition of the H2O2 by catalase.
These reactions are responsible for 20% of oxygen
consumption in liver, and several oxidases have been
identified in peroxisomes. Ethanol elimination via catalase
in peroxisomes may be significant in addition to the
oxidation via cytosolic alcohol dehydrogenase.

The peroxisomal beta-oxidation system
catalyses the beta-oxidative chain shortening of a specific
set of compounds which cannot be handled by mitochondria:
very long chain fatty acids, di- and trihydroxycholestanoic
acids, pristanic acid, long chain dicarboxylic acids,
several prostaglandins, several leukotrienes, 12- and
15-hydroxyeicosatetraenoic acid, and several mono- and
polyunsaturated fatty acids, which are of direct diagnostic
relevance for some peroxisomal disorders.

Peroxisomes play also a major role in the 
synthesis of cholesterol and other isoprenoids. Fibroblasts 
20 from patients affected by disorders of peroxisome 
biogenesis show low capacity to synthesise cholesterol. 

Two enzyme activities responsible for
introduction of the characteristic ether linkage in
ether-linked phospholipids (dihydroxyacetonephosphate
acyltransferase (DHAPAT) and alkyldihydroxyacetonephosphate
synthase (alkyl-DHAP synthase)) are localised in
peroxisomes. These enzymes are not yet cloned. As
demonstrated by the identification of patients with
deficiency of either DHAPAT or alkyl-DHAP synthase with
severe clinical abnormalities, ether-phospholipids are of
major importance in humans.

Peroxisomes are able to detoxify glyoxylate
via alanine/glyoxylate aminotransferase. The deficiency of
this cloned enzyme causes hyperoxaluria type I.
L-pipecolate is a minor metabolite of L-lysine and is
catabolised by the L-pipecolate oxidase localised in
peroxisomes. The enzyme is deficient in
cerebro-hepato-renal (Zellweger) syndrome.

In humans, the importance of peroxisomes was
emphasised by a number of inherited diseases involving
either a defect in the biogenesis of peroxisomes or a
deficiency of one (or more) peroxisomal enzymes. So far, 12
different peroxisomal disorders have been described and
most of them are lethal.

A wide variety of chemicals have been shown
to produce peroxisome proliferation and induction of
peroxisomal and microsomal fatty acid-oxidising enzyme
activities in rats and mice. Several peroxisome
proliferators have been shown to increase the incidence of
liver tumours in these species. Proposed mechanisms of
liver tumour formation by peroxisome proliferators include
induction of sustained oxidative stress.

Therefore, newly identified molecules
associated with peroxisomes could be used for the
development of diagnostic tools and possibly for the
improvement of several therapeutic applications for
various diseases associated with peroxisomal disorders. In
addition, it is useful to identify the molecules that are
present in specific organs such as the lung and that may be
used as specific markers of inflammatory diseases as well as
lung injuries or diseases.

Summary of the invention

The Inventors have isolated and purified a
new sequence of a low molecular weight human
broncho-alveolar polypeptide. Said mammal, preferably human,
protein or polypeptide (hereafter identified as B18hum
protein) has been sequenced and its corresponding genomic
DNA (SEQ ID NO 8) and cDNA (SEQ ID NO 1) have been
identified. Similarly, the corresponding nucleotide and
amino acid sequences from a rat (SEQ ID NO 3 and 4) and from
a mouse (SEQ ID NO 5 and 6) have been obtained.

Said sequences present several homologies
with other peroxisomal proteins of yeast and comprise a
carboxy-terminal tripeptide SQL which is necessary for the
specific targeting and translocation of several proteins
into the peroxisome.

Therefore, the present invention is related
to a new isolated and purified polypeptide sequence having
an amino acid sequence which presents more than 70%
homology, advantageously more than 85% homology, more
preferably more than 95% homology, with the amino acid
sequence SEQ ID NO 2. Said amino acid sequence is
advantageously obtained from a mammal, preferably from a
rat, a mouse or a human.
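
The 70%, 85% and 95% thresholds above can be checked mechanically once two sequences have been aligned. The following is a minimal sketch only: the text does not state how homology is scored, so the function counts strict identity over gap-free alignment columns, which is an assumption, and the names percent_identity and homology_tier are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns whose two residues are identical.

    Assumption: strict identity is used as the score; the text does not
    define how its "homology" percentages are computed.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns where neither sequence has a gap character.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def homology_tier(percent: float) -> str:
    """Map a percentage onto the three thresholds named in the text."""
    if percent > 95:
        return "more than 95%"
    if percent > 85:
        return "more than 85%"
    if percent > 70:
        return "more than 70%"
    return "70% or less"
```

A sequence scoring exactly 70% falls outside the broadest tier, because the text requires "more than 70%".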

The present invention is also related to the
isolated and purified polypeptide sequence corresponding to
the amino acid sequence SEQ ID NO 2 or a portion thereof,
preferably an immunoreactive portion (putative immunogenic
domain or T or B cell epitopes).

Said portions are advantageously comprised
between:

- Glutamic acid position 13 - Glutamic acid position 27
- Alanine position 26 - Leucine position 36
- Alanine position 42 - Glutamic acid position 57
- Glutamic acid position 57 - Valine position 69
- Valine position 80 - Leucine position 97
- Arginine position 95 - Leucine position 112
- Serine position 118 - Serine position 129
- Valine position 137 - Threonine position 150
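
The position ranges listed above can be turned into peptide fragments by slicing the sequence. The sketch below uses only the first 50 residues of the human B18 sequence as printed in the Figure 5 alignment. It assumes that the numbering starts after the initiator methionine; this convention is not stated in the text and is inferred here only because the residues named in the list (for example Glutamic acid 13 and Leucine 36) line up under that numbering. The function names are hypothetical.

```python
# First 50 residues of human B18 as printed in the Figure 5 alignment.
B18_HUM_NTERM = "MAPIKVGDAIPAVEVFEGEPGNKVNLAELFKGKKGVLFGVPGAFTPGCSK"


def residue_at(sequence: str, position: int, skip_initiator_met: bool = True) -> str:
    """Return the residue at a 1-based position.

    Assumption: with skip_initiator_met=True, position 1 is the residue
    that follows the initial Met (inferred, not stated in the text).
    """
    index = position if skip_initiator_met else position - 1
    if not 0 <= index < len(sequence):
        raise IndexError("position lies outside the supplied sequence")
    return sequence[index]


def portion(sequence: str, start: int, end: int, skip_initiator_met: bool = True) -> str:
    """Return the fragment from start to end, both inclusive and 1-based."""
    shift = 1 if skip_initiator_met else 0
    if start < 1 or end < start or end + shift > len(sequence):
        raise ValueError("range lies outside the supplied sequence")
    return sequence[start - 1 + shift : end + shift]


if __name__ == "__main__":
    # First two portions of the list above.
    print(portion(B18_HUM_NTERM, 13, 27))
    print(portion(B18_HUM_NTERM, 26, 36))
```

Portions that extend beyond residue 49 need the complete sequence (SEQ ID NO 2) rather than the 50-residue excerpt used here.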

Preferably, said portion has more than 10,
20, 30, 50 or 70 amino acids. Specific portions of the
amino acid sequence SEQ ID NO 2 are also portions of more
than 70 amino acids which present at least 80% of the
proteinic activity (see example 5) of the complete SEQ ID
NO 2 sequence. Therefore, the amino acid sequence according
to the invention can be partially deleted while maintaining
its activity, preferably its anti-oxidative activity, which
will be described hereafter.

According to the invention, the amino acid
sequence SEQ ID NO 2 presents a pI of 7.16 and a molecular
weight of 17047 Dalton as hereafter defined by
bidimensional electrophoresis.

The present invention is also related to the
nucleotide sequence encoding the amino acid sequence
according to the invention and its regulatory sequences
upstream of said coding sequence. A nucleotide sequence
encoding the polypeptide according to the invention is a
genomic DNA (see SEQ ID NO 10), a cDNA (see SEQ ID NO 1) or
an mRNA, possibly comprising said upstream regulatory
sequence. Advantageously, said nucleotide sequence presents
more than 70%, advantageously more than 85%, more
preferably more than 95% homology with SEQ ID NO 1 or its
complementary strand.

According to a preferred embodiment of the
present invention, said nucleotide sequence corresponds to
the nucleotide sequence SEQ ID NO 1, its complementary
strand or a portion thereof.

"A portion of the nucleotide sequence SEQ ID
NO 1" means any nucleotide sequence of more than 15 base
pairs (such as a primer, a probe or an antisense nucleotide
sequence) which allows the specific identification,
reconstitution or blocking of the complete nucleotide
sequence SEQ ID NO 1 (including its regulatory sequences
upstream of the coding sequence).

Said portions allow the specific
identification, reconstitution or blocking by specific
hybridisation with the nucleotidic sequence SEQ ID NO 1,
preferably under standard stringent conditions, with
sequences like probes or primers possibly labelled with a
compound (radioactive compound, enzyme, fluorescent marker,
etc.), and can be used in a specific diagnostic or dosage
method like probe hybridisation (see Sambrook et al., §§
9.47-9.51 in Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, New
York (1989)), genetic amplification (like PCR (US patent
4,683,195), LCR (Wu et al., Genomics 4, pp. 560-569), CPR
(US patent 5,011,769)).

Exemplary stringent hybridisation conditions
are as follows: hybridisation at 42 °C in 50% formamide, 5x
SSC, 20 mM sodium phosphate, pH 6.8; washing in 0.2x SSC at
55 °C. It is understood by those skilled in the art that
variation of these conditions occurs based on the length and
GC nucleotide content of the sequence to be hybridised.
Formulas standard in the art are appropriate for
determining exact hybridisation conditions (see Sambrook et
al.).
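
As an illustration of how length, GC content and formamide enter such a formula, the sketch below implements one commonly quoted empirical approximation for the melting temperature of long DNA:DNA hybrids. It is not taken from this text: the coefficients differ slightly between sources, the approximation is not meant for short oligonucleotides, and the sodium concentration corresponding to a given SSC strength has to be supplied by the user. The function names are hypothetical.

```python
import math


def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = sequence.lower()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("g") + seq.count("c")) / len(seq)


def estimated_tm_celsius(
    gc_pct: float,
    length_nt: int,
    sodium_molar: float,
    formamide_pct: float = 0.0,
) -> float:
    """Approximate melting temperature of a long DNA:DNA hybrid.

    Assumed empirical form (coefficients vary between references):
    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.63*(%formamide) - 600/length
    """
    if length_nt <= 0 or sodium_molar <= 0:
        raise ValueError("length and sodium concentration must be positive")
    return (
        81.5
        + 16.6 * math.log10(sodium_molar)
        + 0.41 * gc_pct
        - 0.63 * formamide_pct
        - 600.0 / length_nt
    )
```

Under this form, each additional percent of formamide lowers the estimate by 0.63 °C, while a higher GC content or a longer hybrid raises it, which is the dependence the paragraph above refers to.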

Preferred examples of said nucleotide
portions are as follows:

Sequence                                          Position
5'-gccatcccagcagtggaggtgtttg-3' (SEQ ID NO 11)    217-241
5'-ttgaacagctctgccaggttcacc-3' (SEQ ID NO 12)     261-284
5'-tggaggtgtttgaaggggagccag-3' (SEQ ID NO 13)     230-253
5'-caggttcaccttgttccctggctc-3' (SEQ ID NO 14)     247-270
5'-gggtatgggactagctggcg-3' (SEQ ID NO 15)         33-52
5'-ctggccaacattccaattgcag-3' (SEQ ID NO 16)       747-768

and the sequences of respectively 601 (SEQ ID NO 8), 604
(SEQ ID NO 9) and 469 (SEQ ID NO 7) base pairs
corresponding to specific mRNA alternative splicing of the
B18 human nucleotide sequence as described in Figure 4 (the
known genomic sequence incorporating several introns and
exons is represented in the sequence SEQ ID NO 10).

Said sequences may be used for a genetic 
amplification or a probe hybridisation as above-described. 
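
The listed portions can be sanity-checked against their stated positions before they are used as primers or probes. The sketch below is illustrative: it verifies that each sequence length matches its position range, and that overlapping portions agree once the presumed antisense ones are reverse-complemented. Which portions are antisense is not stated in the text; the flags below are an assumption inferred from that overlap check.

```python
# seq_id: (sequence 5'->3', first position, last position, assumed antisense)
PRIMERS = {
    11: ("gccatcccagcagtggaggtgtttg", 217, 241, False),
    12: ("ttgaacagctctgccaggttcacc", 261, 284, True),
    13: ("tggaggtgtttgaaggggagccag", 230, 253, False),
    14: ("caggttcaccttgttccctggctc", 247, 270, True),
    15: ("gggtatgggactagctggcg", 33, 52, False),
    16: ("ctggccaacattccaattgcag", 747, 768, True),
}

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a lowercase or uppercase DNA sequence."""
    return seq.lower().translate(_COMPLEMENT)[::-1]


def length_matches_positions(seq_id: int) -> bool:
    """True when the sequence length equals the size of its position range."""
    seq, start, end, _ = PRIMERS[seq_id]
    return len(seq) == end - start + 1


def on_sense_strand(seq_id: int):
    """Return (sequence as read on the sense strand, start, end)."""
    seq, start, end, antisense = PRIMERS[seq_id]
    return (reverse_complement(seq) if antisense else seq), start, end


def overlap_agrees(id_a: int, id_b: int) -> bool:
    """True when two portions agree wherever their position ranges overlap."""
    seq_a, start_a, end_a = on_sense_strand(id_a)
    seq_b, start_b, end_b = on_sense_strand(id_b)
    lo, hi = max(start_a, start_b), min(end_a, end_b)
    if lo > hi:
        return True  # ranges do not overlap, nothing to compare
    return seq_a[lo - start_a : hi - start_a + 1] == seq_b[lo - start_b : hi - start_b + 1]


def amplicon_length(forward_id: int, reverse_id: int) -> int:
    """Product size spanned by a forward and a reverse primer, ends included."""
    return PRIMERS[reverse_id][2] - PRIMERS[forward_id][1] + 1
```

With SEQ ID NO 15 as forward primer and SEQ ID NO 16 as reverse primer, the spanned region is 736 base pairs, which matches the size reported for the full-length RT-PCR product in Example 6.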
The present invention is also related to a
vector comprising the necessary elements for the injection,
transfection or transduction of cells and having
incorporated one or more of the nucleotide sequences
according to the invention. The vector according to the
invention is selected from the group consisting of viruses,
plasmids, phagemids, cationic vesicles, liposomes or a
mixture thereof. Said vector may comprise also one or more
adjacent regulatory sequences (such as promoter(s),
secretion and termination signal sequence(s)),
advantageously operably linked to the nucleotide sequence
according to the invention.

The present invention is also related to the
cell transformed by said vector and expressing the
polypeptide according to the invention.

The nucleotide sequence according to the
invention can be also introduced in said cell by the
formation of CaPO4-nucleic acid precipitate,
DEAE-dextran-nucleic acid complex or by electroporation.

Another aspect of the present invention is
related to an inhibitor of the polypeptide according to the
invention or the nucleotide sequence according to the
invention (including the upstream sequences like
promoter-operator regulatory sequence which may be inhibited
by a cis- and/or transactivating repressor). Said inhibitor
is advantageously an antibody or a fragment of said antibody
such as a hypervariable portion of said antibody directed
against the amino acid or nucleotide sequence of the
polypeptide according to the invention. Other examples of
inhibitors according to the invention are antisense
nucleotide sequences which allow the blocking of the
expression of the nucleotide sequence according to the
invention.

Another aspect of the present invention is
related to a diagnostic device (such as a diagnostic kit or
a chromatographic column) comprising an element selected
from the group consisting of the amino acid sequence of
said polypeptide, its nucleotide sequence, and/or the
inhibitor according to the invention or a fragment thereof
as above-described. Said diagnostic device may comprise
also necessary reactants and media for the diagnosis
and/or dosage of the nucleotide and/or amino acid sequence
of the polypeptide according to the invention, which are
based upon the method selected from the group consisting of
in situ hybridisation, hybridisation by labelled
antibodies, especially RIA (Radio Immuno Assay) or ELISA
(Enzyme-Linked Immunosorbent Assay) technologies,
detection upon filter, upon solid support, in solution, in
sandwich, upon gel, dot blot hybridisation, Northern blot
hybridisation, Southern blot hybridisation, isotopic or
non-isotopic labelling (by immunofluorescence or
biotinylated probes), genetic amplification (especially by
PCR or LCR), double immunodiffusion technique,
counter-electrophoresis technique, haemagglutination or a
mixture thereof.

Another aspect of the present invention
concerns a diagnosis method wherein a biological sample
from the patient, such as cephalo-rachidian fluid, serum,
blood, plasma, urine, broncho-alveolar lavage, stomach
lavage, etc., is isolated from the patient, and is put in
contact with the diagnostic device according to the
invention for the diagnosis or the monitoring of an injury
or a disease, preferably a lung injury or an oxidative
stress-related disorder, affected by the presence of
pro-oxidant agent or oxidative stress such as specific
cardio-vascular diseases like arteriosclerosis,
neurodegenerative disorders (Alzheimer's disease,
Parkinson's disease, amyotrophic lateral sclerosis),
apoptosis, inflammatory reactions, allergic reactions such
as asthma, hay fever and eczema, high bone mass syndrome,
osteopetrosis, osteoporosis-pseudoglioma syndrome, and
Bardet-Biedl syndrome 1. Said diagnosis and monitoring upon
one or more biological samples obtained from several tissues
from the patient can be advantageously obtained by one or
more of the methods above-described, which could be adapted
according to the specific biological sample by the person
skilled in the art.

Therefore, the product according to the
invention could be used as a marker for the
above-identified injuries, diseases or disorders in a broad
spectrum of tissues as shown in the enclosed Figure 1.

A further aspect of the present invention is
related to a pharmaceutical composition comprising a
pharmaceutically acceptable carrier and an element selected
from the group consisting of the nucleotide sequence, the
amino acid sequence of the polypeptide according to the
invention, the inhibitor directed against said sequences
and/or one or more portions thereof.

A last aspect of the present invention is
related to the use of the pharmaceutical composition
according to the invention for the manufacture of a
medicament in the treatment and/or the prevention of lung
injuries and/or diseases or of oxidative stress-related
disorders.

The present invention is also related to a
prevention and/or treatment method of a patient, especially
a human patient, preferably affected by lung injuries
and/or diseases or by oxidative stress-related disorders,
wherein a sufficient amount of the pharmaceutical
composition according to the invention is administered to
said patient in order to treat, avoid and/or reduce the
symptoms of said injuries and/or diseases.

Other injuries and/or diseases which can be
prevented and/or treated are injuries and/or diseases
affected by the presence of pro-oxidant agents or oxidative
stress, such as specific cardio-vascular diseases like
arteriosclerosis, neurodegenerative disorders such as
Alzheimer's disease, Parkinson's disease, amyotrophic
lateral sclerosis, apoptosis and inflammatory reactions and
some allergic reactions such as asthma, hay fever and
eczema, high bone mass syndrome, osteopetrosis,
osteoporosis-pseudoglioma syndrome, and Bardet-Biedl
syndrome 1.

The pharmaceutically acceptable carrier
according to the invention is any compatible non-toxic
substance suitable for administering the composition
according to the invention to a human patient.
Pharmaceutically acceptable carriers according to the
invention suitable for oral administration are the ones
well known by the person skilled in the art, such as
tablets, coated or non-coated pills, capsules, spray-gas,
patches, gels, solutions or syrups. Pharmaceutically
acceptable carriers vary according to the mode of
administration (intravenous, intramuscular, subcutaneous,
parenteral, etc.), and may comprise also adjuvants well
known by the person skilled in the art to increase, reduce
and/or regulate humoral, local and/or cellular response of
the immune system.

The pharmaceutical composition according to
the invention may be prepared by the methods generally
applied by the person skilled in the art in the preparation
of various pharmaceutical compositions, wherein the
percentage of the active compound/pharmaceutically
acceptable carrier can vary within very large ranges, only
limited by the tolerance of the patient to said
pharmaceutical composition, and wherein the limits are
particularly determined by the frequency of administration
and the possible side-effects of the active compound or
its pharmaceutically acceptable carrier.

Another aspect of the invention is related to
the use of the diagnostic device according to the invention
for performing upon the patient or upon a biological fluid
obtained from the patient, a diagnosis, a dosage and/or a
monitoring of the above-mentioned injuries or diseases or
oxidative stress-related disorders affecting the patient.

A further aspect of the present invention is
related to a cell or a non-human animal, preferably a
mammal such as a mouse or a rat, transformed by the vector
according to the invention and overexpressing the
polypeptide according to the invention, or a non-human
animal, preferably a mammal such as a mouse or a rat,
genetically modified by a partial or total deletion of its
genomic sequence encoding the polypeptide according to the
invention (knock-out non-human mammal) and obtained by
methods well known by the person skilled in the art such as
the one described by Kahn et al. (Cell, Vol. 92, pp.
593-596 (March 1998)).

Other examples of genetically modified
non-human animals according to the invention may be a
transgenic non-human animal comprising an inhibitor
according to the invention, preferably an antisense nucleic
acid sequence complementary to the nucleotide sequence
according to the invention so placed as to be transcribed
into antisense mRNA which is complementary to the
nucleotide sequence according to the invention and which
hybridises to said nucleotide sequence, thereby reducing or
blocking its translation.

Further aspects of the present invention will
be described in the enclosed non-limiting examples in
reference to the following Figures.

Brief description of the drawings

Figure 1 represents a dot blot analysis of mRNA
encoding the polypeptide according to the
invention in various types of human tissues.

Figure 2 represents a Northern blot analysis of mRNA
encoding the polypeptide according to the
invention in a rat lung after administration
of lipopolysaccharides (LPS) inducing an
inflammatory reaction of the lung.

Figure 3 represents a Northern blot analysis of mRNA
encoding the polypeptide according to the
invention in a rat lung after intraperitoneal
injection of pneumotoxicants.

Figure 4 is a schematic representation of the human
genomic sequence, the complete cDNA sequence
and the corresponding amino acid sequence.

Figure 5 represents respectively the alignment of the
sequences of the human B18 polypeptide
according to the invention with the
corresponding rat and mouse sequences.



Example 1: Homology between the B18 polypeptide
according to the invention and other known
nucleotide or amino acid sequences

The BLAST 2.0 software (gapped BLAST at the
NCBI Internet site) was used for searching for homologies
between human B18 (162 amino acids) and known polypeptides
in databases (GenBank, SwissProt). Said search did not give
perfect alignment with known peptides from different
species (Table 1). Homologies of the human B18 cDNA (805
nucleotides) with GenBank, EMBL, DDBJ and PDB deposited
nucleotide sequences (Table 2) and GenBank Expressed
Sequence Tags (ESTs) were noted.
Table 1: Homologies of the B18 protein (162 amino
acids) with other proteins

Name                                                                     NCBI ID    Identity (%)    Homology (%)
Membrane protein (Synechocystis sp.)                                     1652859    57/129 (44%)    81/129 (62%)
Peroxisomal-like protein (Aspergillus fumigatus)                         2769700    56/176 (31%)    90/176 (50%)
Haein HI0572 hypothetical protein (Haemophilus influenzae)               1723174    53/146 (36%)    80/146 (54%)
PMP20 (Schizosaccharomyces pombe)                                        AJ002536   54/161 (33%)    85/161 (52%)
Peroxisomal membrane protein A (PMP 20) (Candida boidinii)               130360     59/170 (34%)    89/170 (51%)
Peroxisomal membrane protein B (PMP 20) (Candida boidinii)               130361     58/170 (34%)    88/170 (51%)
Putative peroxisomal protein PMP from yeast (Saccharomyces cerevisiae)   1709682    41/138 (29%)    72/138 (51%)
Alkylhydroperoxide reductase C22 protein (Escherichia coli)              P26427     36/126 (28%)    58/126 (45%)

Table 2

Name                                                           Accession No.   Identity
Human mRNA down-regulated in cells infected by adenovirus 5    U82616          259/263 (98%)
Human mRNA down-regulated in cells infected by adenovirus 5    U82615          300/321 (93%)

In Table 2, an identity of 98% has been
obtained with the alignment of 259 nucleotides of B18 cDNA,
which comprises in its totality 805 nucleotides, with 263
nucleotides of U82616 cDNA. A similar identity has been
obtained with the U82615 sequence.
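
The identity figures of Table 2 follow directly from the match counts. The short sketch below recomputes them; it assumes that each fraction in the table reads as identical nucleotides over aligned nucleotides, and the function name is hypothetical.

```python
# Accession number: (identical nucleotides, aligned nucleotides), from Table 2.
TABLE_2 = {
    "U82616": (259, 263),
    "U82615": (300, 321),
}


def identity_percent(identical: int, aligned: int) -> float:
    """Identity as a percentage of the aligned length."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return 100.0 * identical / aligned


if __name__ == "__main__":
    for accession, (identical, aligned) in TABLE_2.items():
        print(accession, f"{identity_percent(identical, aligned):.1f}%")
```

Both alignments cover only part of the 805-nucleotide B18 cDNA, so these percentages describe local identity rather than identity over the full sequence.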

The sequence SEQ ID NO 1 comprising 805
nucleotides presents a homology with several EST sequences
obtained from a human and from a mouse, having the
following references:

H:

AA130751, N42215, W38597, N91311, N68467, AA187737,
N68916, W00593, R88950, AA181884, H20154, H66666

Mouse:

AA220019, AA123351, AA087129, AA255021, AA249897, W71344



Example 2: Tissue detection

A Human RNA Master Blot (Clontech) containing
100-500 ng of poly-A+ human RNA in each dot (normalised to
the mRNA expression levels of eight different housekeeping
genes) was hybridised with a 554 bp-long B18 probe labelled
with 32P, and quantified using Phosphorimaging Technology.
As shown in Figure 1, B18 mRNA is present in all tissues
examined but predominantly in trachea, lung, kidney,
thyroid gland, stomach, colon, heart and some regions of
the brain. Highest expression has been noted in the thyroid
tissue. This presence is probably correlated with the
possible antioxidant activity of the B18 polypeptide
according to the invention.



Example 3: Inflammatory reaction

Figure 2 represents a Northern blot analysis
of rat lung mRNA 6, 48 and 72 hours after
lipopolysaccharides (LPS) instillation inducing an
inflammatory reaction in the lung.

A Northern blot containing 15 µg of total RNA
in each lane was hybridised with a 225 bp-long rat B18
probe, stripped and reprobed with a 572 bp-long rat β-actin
probe, both labelled with 32P. The Northern blot was
quantified using Phosphorimaging Technology and the B18 mRNA
data were normalised to β-actin mRNA level.
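
The normalisation step described above amounts to dividing each lane's B18 signal by the β-actin signal of the same lane. The sketch below illustrates that calculation; the function names and the lane labels are hypothetical, and no measured values from the experiments are reproduced here.

```python
def normalise_to_reference(target_signal: dict, reference_signal: dict) -> dict:
    """Divide each lane's target signal by the reference signal of that lane.

    Both arguments map a lane label to a quantified band intensity
    (arbitrary units from the imaging system).
    """
    normalised = {}
    for lane, target in target_signal.items():
        reference = reference_signal[lane]
        if reference <= 0:
            raise ValueError(f"reference signal for lane {lane!r} must be positive")
        normalised[lane] = target / reference
    return normalised


def relative_to_control(normalised: dict, control_lane: str) -> dict:
    """Express each normalised value as a multiple of the control lane."""
    baseline = normalised[control_lane]
    if baseline <= 0:
        raise ValueError("control lane value must be positive")
    return {lane: value / baseline for lane, value in normalised.items()}
```

Normalising to a housekeeping transcript corrects for unequal RNA loading between lanes, so that differences between time points reflect B18 expression rather than the amount of RNA on the blot.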

Example 4: Pneumotoxicant action

Figure 3 represents a Northern blot analysis
of rat lung mRNA after intraperitoneal injection of
pneumotoxicants (4-ipomeanol, 1-(3-furyl)-4-hydroxypentanone
(IPO), methylcyclopentadienyl manganese tricarbonyl (MMT)
or alpha-naphthylthiourea (ANTU)). These agents are known to
induce in the lung acute lesions of Clara (IPO) and
alveolar cells (MMT) as well as increasing the permeability
of the alveolar/blood barrier (ANTU). A Northern blot
containing 15 µg of total RNA in each lane was hybridised
with a 225 bp-long rat B18 probe, stripped and reprobed
with a 572 bp-long β-actin probe, both labelled with 32P.
The Northern blot was quantified using Phosphorimaging
Technology and rat B18 mRNA data were normalised to β-actin
mRNA level.

Example 5: Proteinic activity of the B18 polypeptide

An analysis of the complete human B18
amino acid sequence shows that said polypeptide presents
specific portions showing a homology with other
anti-oxidant enzymes (starting from a Leucine at position 36
until a Cysteine at position 47) and another portion
having an important homology with beta chains of ATP
synthase (starting from a Glutamic acid at position 13
until a Glycine in position 38).

Furthermore, the B18 amino acid sequence
according to the invention shows an important homology with
an Aspergillus fumigatus allergen (34% identity and 60%
homology by using Clustal V sequence alignment), especially
upon the portion of said B18 polypeptide having possible
antioxidant properties. Therefore, it is possible that a
peroxisomal protein (possibly homologous to B18 protein) is
able to induce and to bind IgE from patients sensitised to
Aspergillus fumigatus peroxisomal proteins after an
induction of the patient immune system with Aspergillus
fumigatus allergen. This mechanism can be compared to a
reaction obtained with the manganese superoxide dismutase
(MnSOD) wherein the human MnSOD is able to bind to IgE from
patients sensitised to Aspergillus fumigatus MnSOD.

Furthermore, the Inventors have identified a
portion of the B18 human polypeptide which presents a
homology with a Cyclophilin-binding domain of Candida
boidinii PMP20 (receptor of the immunosuppressant drug
cyclosporine A). Said possible Cyclophilin-binding domain
is starting from the Threonine in position 150 until the
Leucine in position 161.

Example 6: B18 human gene and mRNA alternative splicing

As represented in the enclosed Figure 4, the
Inventors have identified upon the genomic DNA (SEQ ID NO
10) 5 exons and 5 introns. By RT-PCR (using primers
5'-gggtatgggactagctggcg-3' and 5'-ctggccaacattccaattgcag-3')
and according to the genomic sequence, 4 different cDNAs
corresponding to the transcription of the said genomic DNA
have been identified in human lung and in human brain. A
first cDNA of 736 bp corresponds to the cDNA encoding the
complete amino acid sequence of the B18 protein according
to the invention. However, 3 other cDNAs of 601, 604 and
469 bp were also identified, and comprise specific
splicings of one or more exons.
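
The reported product sizes can be related to each other by simple arithmetic. The sketch below is illustrative: the primer positions 33 and 768 are taken from the list of preferred nucleotide portions in the Summary, and the variant labels are hypothetical. That the 469 bp product lacks both segments missing from the 601 bp and 604 bp products is only one possible reading of the numbers and is not stated in the text.

```python
# Positions of the RT-PCR primers on SEQ ID NO 1, as listed in the Summary.
FORWARD_START = 33   # first position of 5'-gggtatgggactagctggcg-3'
REVERSE_END = 768    # last position of 5'-ctggccaacattccaattgcag-3'

# Product sizes reported for human lung and brain (labels are hypothetical).
PRODUCT_SIZES_BP = {
    "full length": 736,
    "variant 601": 601,
    "variant 604": 604,
    "variant 469": 469,
}


def expected_full_length() -> int:
    """Size of the region spanned by the two primers, ends included."""
    return REVERSE_END - FORWARD_START + 1


def missing_bp(variant: str) -> int:
    """Number of base pairs a variant lacks relative to the full-length product."""
    return PRODUCT_SIZES_BP["full length"] - PRODUCT_SIZES_BP[variant]


if __name__ == "__main__":
    print("expected full-length product:", expected_full_length(), "bp")
    for name in ("variant 601", "variant 604", "variant 469"):
        print(name, "lacks", missing_bp(name), "bp")
```

The deficits of the 601 bp and 604 bp products add up to the deficit of the 469 bp product, which is consistent with, but does not prove, the shortest transcript lacking both of the segments that the other two lack individually.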

Therefore, another aspect of the present
invention is related to said specific portions of the
complete genomic or cDNA nucleotide sequence according to
the invention or to specific portions of the complete amino
acid sequence of the B18 protein according to the
invention, which could be used also as specific markers of
the B18 activity, preferably the anti-oxidative activity.

Example 7: Knock-out mouse

Exons of a mouse genomic sequence encoding
the B18 polypeptide according to the invention have been
deleted by homologous recombination. Said homologous
recombination has been obtained with a genetic sequence
comprising a neomycin resistance gene. The targeting vector
with said gene and a thymidine kinase (in order to
eliminate non-homologous recombinants with ganciclovir) has
been prepared. Said recombination was used for the deletion
of one or more exons of the B18 polypeptide. After
electroporation of ES cells with the targeting vector,
positive clones having incorporated homologous
recombination were identified by Southern blot with
labelled probes. Aggregation of said positive clones with a
morula from a Swiss pseudo-pregnant mouse produces several
chimeric mice which survive after birth. Several homozygous
mice are obtained by cross-breeding and are used as a model
for the above-mentioned diseases.

Similar experiments may be done with another
mammal whose B18 sequence is known (the B18 sequences of a
mouse and a rat and their alignment with the human sequence
are shown in the enclosed Figure 5).

Example 8: Chromosomal localisation of human B18

Radiation hybrid clones (GeneBridge 4
Radiation Hybrid Panel, Research Genetics) were used for
performing chromosome localisation by PCR with two
different pairs of primers (5'-caggttcaccttgttccctggctc-3'
(SEQ ID NO 14), 5'-atgttatgcaaccctttgcgacac-3' (SEQ ID NO
17) and 5'-gtgtttgaaggggagccagggaac-3' (SEQ ID NO 18),
5'-agagacagggtttcaccatcttgg-3' (SEQ ID NO 19)).

The Inventors have located the B18 genomic
sequence on human chromosome 11q13. The B18 gene has been
located 7.15-6.1 cR from marker D11S913 between markers
D11S1963 and D11S4407 (Genome Database internet site).

Unknown genes linked to different disorders
have been localised in the same region of chromosome 11.
Therefore, the B18 gene is possibly associated with these
disorders:

- atopy (atopic hypersensitivity: asthma, hay fever and
  eczema; MIM n° 147050 at OMIM of NCBI internet site),
- high bone mass syndrome (MIM n° 601884),
- osteopetrosis (MIM n° 259700),
- osteoporosis-pseudoglioma syndrome (MIM n° 259770) and
- Bardet-Biedl syndrome 1 (MIM n° 209901).
CLAIMS

1. Amino acid sequence having more than 70%
homology with the sequence SEQ ID NO 2.

2. Amino acid sequence according to claim 1,
having more than 85% homology with the sequence SEQ ID NO
2.

3. Amino acid sequence according to claim 1
or 2, having more than 95% homology with the sequence SEQ
ID NO 2.

4. Amino acid sequence according to any one
of the preceding claims, corresponding to SEQ ID NO 2 or an
immunoreactive portion thereof.

5. Nucleotide sequence encoding the amino
acid sequence according to any one of the preceding claims
and presenting more than 70% homology with SEQ ID NO 1 or
its complementary strand.

6. Nucleotide sequence according to claim 5,
having more than 85% homology with the sequence SEQ ID NO 1
or its complementary strand.

7. Nucleotide sequence according to claim 5,
having more than 95% homology with the sequence SEQ ID NO 1
or its complementary strand.

8. Nucleotide sequence according to any one
of the claims 5 to 7, corresponding to the sequence SEQ ID
NO 1, its complementary strand or a portion thereof
specific for SEQ ID NO 1 and comprising more than 15 base
pairs.

9. Vector comprising the nucleotide sequence
according to any one of the claims 5 to 8.

10. Inhibitor directed against the amino acid
or nucleotide sequence according to any one of the claims 1
to 8.

11. Inhibitor according to claim 10, being an
antibody, preferably a monoclonal antibody, or a portion of
said antibody.

12. Diagnostic device comprising an element
selected from the group consisting of the amino acid
sequence according to any one of the claims 1 to 4, the
nucleotide sequence according to any one of the claims 5 to
8, the inhibitor according to claim 10 or 11, their
portions or a mixture thereof.

13. Method for the in vitro detection of lung
injuries and diseases or oxidative stress-related diseases
and disorders, especially inflammatory diseases, comprising
the steps of:

- isolating a sample from a body fluid of a patient,
  preferably a human patient,
- possibly inhibiting the contaminants present in said
  sample,
- putting said sample in contact with an element selected
  from the group consisting of the amino acid sequence
  according to any one of the claims 1 to 4, the
  nucleotide sequence according to any one of the claims 5
  to 8, the inhibitor according to claim 10 or 11, their
  portions or a mixture thereof, and
- detecting a reaction of a molecule present in said
  sample with said element.

14. Pharmaceutical composition comprising a
pharmaceutically acceptable carrier and an element selected
from the group consisting of the amino acid sequence
according to any one of the claims 1 to 4, the nucleotide
sequence according to any one of the claims 5 to 8, the
inhibitor according to claim 10 or 11, their portions or a
mixture thereof.

15. Use of the pharmaceutical composition 
according to claim 14 for the manufacture of a medicament 
for the prevention and/or the treatment of lung injuries or 
diseases, and of oxidative stress-related diseases or 
disorders, such as specific cardio-vascular diseases like 
arteriosclerosis, neurodegenerative disorders such as 
Alzheimer's disease, Parkinson's disease, amyotrophic 
lateral sclerosis, apoptosis and inflammatory reactions, 
allergic reactions such as asthma, hay fever and eczema, 
high bone mass syndrome, osteopetrosis, osteoporosis-pseudoglioma 
syndrome, and Bardet-Biedl syndrome 1. 

16. Cell transformed by the vector according 
to claim 9 or comprising a partial or total deletion of its 
nucleotide sequence according to any one of the claims 5 to 
8. 

17. Non-human animal, preferably a mammal, 
transformed by the vector according to claim 9 or 
comprising a partial or total deletion of its nucleotide 
sequence according to any one of the claims 5 to 8. 
CLUSTAL V alignment of human and rat B18 amino acid sequences (Identity: 
90%, Homology: 97.5%) : 

FIG. 5a 

B18hum  MAPIKVGDAIPAVEVFEGEPGNKVNLAELFKGKKGVLFGVPGAFTPGCSK = SEQ ID NO 1 
B18rat  MAPIKVGDTIPSVEVFEGEPGKKVNLAELFKDKKGVLFGVPGAFTPGCSK 
        ******** ** *********.*********.****************** 

B18hum  THLPGFVEQAEALKAKGVQVVACLSVNDAFVTGEWGRAHKAEGKVRLLAD 
B18rat  THLPGFVEQAGALKAKGAQVVACLSVNDVFVTAEWGRAHQAEGKVQLLAD 
        ********** ****** ********** ***.******.*****.**** 

B18hum  PTGAFGKETDLLLDDSLVSIFGNRRLKRFSMVVQDGIVKALNVEPDGTGL 
B18rat  PTGAFGKETDLLLDDSLVSLFGNRRLKRFSMVIDKGVVKALNVEPDGTGL 

B18hum  TCSLAPNIISQL 
B18rat  TCSLAPNILSQL 
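
The identity figure quoted for FIG. 5a can be checked with a short script. The following Python sketch is illustrative only and is not part of the application. It counts identical columns between the human and rat B18 sequences as drawn in the figure (the rat residues listed as Xaa in the sequence listing are taken as shown in the figure: Glu, Leu and Val). It assumes that identity is expressed as identical columns over all ungapped columns, which may differ from the exact convention used by CLUSTAL V. The name `percent_identity` is hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> tuple[int, int, float]:
    """Return (matches, compared_columns, percent) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    # Assumption: columns where either sequence has a gap are not counted.
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    matches = sum(1 for a, b in columns if a == b)
    return matches, len(columns), 100.0 * matches / len(columns)


# Human and rat B18 sequences, copied row by row from the alignment.
HUMAN = (
    "MAPIKVGDAIPAVEVFEGEPGNKVNLAELFKGKKGVLFGVPGAFTPGCSK"
    "THLPGFVEQAEALKAKGVQVVACLSVNDAFVTGEWGRAHKAEGKVRLLAD"
    "PTGAFGKETDLLLDDSLVSIFGNRRLKRFSMVVQDGIVKALNVEPDGTGL"
    "TCSLAPNIISQL"
)
RAT = (
    "MAPIKVGDTIPSVEVFEGEPGKKVNLAELFKDKKGVLFGVPGAFTPGCSK"
    "THLPGFVEQAGALKAKGAQVVACLSVNDVFVTAEWGRAHQAEGKVQLLAD"
    "PTGAFGKETDLLLDDSLVSLFGNRRLKRFSMVIDKGVVKALNVEPDGTGL"
    "TCSLAPNILSQL"
)

matches, compared, percent = percent_identity(HUMAN, RAT)
print(f"{matches}/{compared} identical residues ({percent:.1f}%)")
```

Under these assumptions the script counts 146 identical residues out of 162, consistent with the 90% identity stated in the figure heading.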

CLUSTAL V alignment of human and mouse B18 amino acid sequences (Identity: 
91%, Homology: 96%) : 

B18hum    MAPIKVGDAIPAVEVFEGEPGNKVNLAELFKGKKGVLFGVPGAFTPGCSK 
B18mouse  MAPIKVGDAIPSVEVFEGEPGKKVNLAELFKGKKGVLFGVPGAFTPGCSK 

B18hum    THLPGFVEQAEALKAKGVQVVACLSVNDAFVTGEWGRAHKAEGKVRLLAD 
B18mouse  THLPGFVEQAGALKAKGAQVVACLSVNDVFVIEEWGRAHQAEGKVRLLAD 

B18hum    PTGAFGKETDLLLDDSLVSIFGNRRLKRFSMVVQDGIVKALNVEPDGTGL 
B18mouse  PTGAFGKATDLLLDDSLVSLFGNRRLKRFSMVIDNGIVKALNVEPDGTGL 

B18hum    TCSLAPNIISQL 
B18mouse  TCSLAPNILSQL 



CLUSTAL V alignment of human and rat cDNA sequences (identity: 612/780, 
78.5%) : 

B18hum  GCCAGGAGGCGGAGTGGAAGTGGCCGTGGGGCGGGTATGGGACTAGCTGG 
B18rat  CTAGGCAG 
        **** * 

B18hum  CGTGTGCGCCCTGAGACGCTCAGCGGGCTATATACTCGTCGGTGGGGCCG 
B18rat  CATA-- GCC GGA-- TCGGTGCTCCGTGCATCGGCTACTTGGAC-- 
        *** ** ******* * * ** 

B18hum  GCGGTCAGTCTGCGGCAGCGGCAGCAAGACGGTGCAGTGAAGGAGAGTGG 
B18rat  GTGCGTGGCAGGCAGAGCAGGCCGG AAAGGAGCAGGTTGG 
B18hum  GCGTCTGGCGGGGTCCGCAGTTTCAGCAGAGCCGCTGCAGCCATGGCCCC 
B18rat  GAGTGTGGTGGGGCCCGCAGCTTCAGCAGTGCCGCGGTGACTATGGCCCC 
        * ** *** **** ****** ******** ***** *   * ******** 

B18hum  AATCAAGGTGGGAGATGCCATCCCAGCAGTGGAGGTGTTTGAAGGGGAGC 
B18rat  GATCAAGGTGGGAGACACCATTCCCTCAGTGGAGGTATTTGAAGGGGAAC 
         **************  **** **  ********** *********** * 

B18hum  CAGGGAACAAGGTGAACCTGGCAGAGCTGTTCAAGGGCAAGAAGGGTGTG 
B18rat  CTGGAAAGAAGGTGAACTTGGCAGAGCTGTTCAAGGACAAGAAAGGTGTT 
        * ** ** ********* ****************** ****** ***** 

B18hum  CTGTTTGGAGTTCCTGGGGCCTTCACCCCTGGATGTTCCAAGACACACCT 
B18rat  TTGTTTGGAGTCCCTGGGGCATTTACACCTGGCTGTTCCAAGACCCATCT 
         ********** ******** ** ** ***** *********** ** ** 

B18hum  GCCAGGGTTTGTGGAGCAGGCTGAGGCTCTGAAGGCCAAGGGAGTCCAGG 
B18rat  GCCTGGGTTTGTGGAGCAAGCCGGAGCTCTGAAGGCCAAGGGAGCACAAG 
        *** ************** ** *  *******************  ** * 

B18hum  TGGTGGCCTGTCTGAGTGTTAATGATGCCTTTGTGACTGGCGAGTGGGGC 
B18rat  TGGTGGCCTGTCTGAGTGTTAATGATGTCTTCGTGACTGCAGAGTGGGGT 
        *************************** *** *******  ******** 

B18hum  CGAGCCCACAAGGCGGAAGGCAAGGTTCGGCTCCTGGCTGATCCCACTGG 
B18rat  CGAGCCCACCAGGCAGAAGGCAAGGTTCAGCTCCTGGCTGACCCCACTGG 
        ********* **** ************* ************ ******** 

B18hum  GGCCTTTGGGAAGGAGACAGACTTATTACTAGATGATTCGCTGGTGTCCA 
B18rat  AGCTTTTGGAAAGGAGACAGATTTACTACTAGATGATTCTTTGGTGTCTC 
         ** ***** *********** *** *************  ******* 

B18hum  TCTTTGGGAATCGACGTCTCAAGAGGTTCTCCATGGTGGTACAGGATGGC 
B18rat  TCTTTGGGAATCGTCGGCTAAAAAGGTTCTCCATGGTGATAGACAAGGGC 
        ************* ** ** ** *************** ** *  * *** 

B18hum  ATAGTGAAGGCCCTGAATGTGGAACCAGATGGCACAGGCCTCACCTGCAG 
B18rat  GTAGTAAAGGCACTGAACGTGGAGCCGGATGGCACAGGCCTCACCTGCAG 
         **** ***** ***** ***** ** *********************** 



B18hum  CCTGGCACCCAATATCATCTCACAGCTCTGAGGCCCTGGGCCAGATTACT 
B18rat  CCTGGCCCCCAACATCCTCTCACAACTCTGAGGCCCTGA-CCAGA AT G 
        ****** ***** *** ******* *************  ***** * 

B18hum  TCCTCCACCCCTCCCTATCTCACCTGCCCAGCCCTGTGCTGG-GGCCCTG 
B18rat  TCCTCTGACTCTCCC-ATCTCCTCCACCCAGCTCTGGGCCAAAGGCCCAG 
        *****   * ***** *****  *  ****** *** **    ***** * 

B18hum  CA ATTGGAATG TTGGCCAGATTTCTGC 
B18rat  TACCTCCTTACCTGAGGGCCACTGGAATGGAACCTTGACAATATTTCTGC 
        * ******* *** * * ******** 

B18hum  AATAAACACTTGTGGTTTGCGGAAAAAAA 
B18rat  AATAAACAGTT-TAATTTGTGAAAAAAAAAAAAAAAAAAAAA 
        ******** ** *  **** * ******* 



CLUSTAL V alignment of human and mouse cDNA sequences (Identity: 552/675, 
81.8%) : 

FIG. 5c 

B18hum    GCCAGGAGGCGGAGTGGAAGTGGCCGTGGGGCGGGTATGGGACTAGCTGG 
B18mouse 

B18hum    CGTGTGCGCCCTGAGACGCTCAGCGGGCTATATACTCGTCGGTGGGGCCG 
B18mouse  TGCTCCGTG CATCGACGTGCTTG 
          **** * * * *** * * * 

B18hum    GCGGTCAGTCTGCGGCAGCGGCAGCAAGACGGTGCAGTGAAGGAGAGTGG 
B18mouse  GCAGGCAG AGCAGGCCGG AAAGAAGCAGGTTGG 
          ** * *** *** * *** * **** * * ** 



B18hum    GCGTCTGGCGGGGTCCGCAGTTTCAGCAGAGCCGCTGCAGCCATGGCCCC 
B18mouse  GAGTGTGGCGGAGCCCGCAGCTTCAGCAGCTCCGCGGTGACCATGGCCCC 
          * ** ****** * ****** ********  **** *   ********** 

B18hum    AATCAAGGTGGGAGATGCCATCCCAGCAGTGGAGGTGTTTGAAGGGGAGC 
B18mouse  GATCAAGGTGGGAGATGCCATTCCCTCAGTGGAGGTATTTGAAGGGGAAC 
           ******************** **  ********** *********** * 

B18hum    CAGGGAACAAGGTGAACCTGGCAGAGCTGTTCAAGGGCAAGAAGGGTGTG 
B18mouse  CGGGAAAGAAGGTGAACTTGGCAGAGCTGTTCAAGGGCAAGAAAGGTGTT 
          * ** ** ********* ************************* ***** 

B18hum    CTGTTTGGAGTTCCTGGGGCCTTCACCCCTGGATGTTCCAAGACACACCT 
B18mouse  TTGTTTGGAGTCCCTGGGGCATTTACACCTGGCTGTTCTAAGACCCACCT 
           ********** ******** ** ** ***** ***** ***** ***** 

B18hum    GCCAGGGTTTGTGGAGCAGGCTGAGGCTCTGAAGGCCAAGGGAGTCCAGG 
B18mouse  GCCTGGGTTTGTGGAGCAAGCTGGAGCTCTGAAGGCTAAGGGAGCGCAGG 
          *** ************** ****  *********** *******  **** 

B18hum    TGGTGGCCTGTCTGAGTGTTAATGATGCCTTTGTGACTGGCGAGTGGGGC 
B18mouse  TGGTGGCCTGTCTGAGCGTTAATGACGTCTTTGTGATTGAAGAGTGGGGT 
          **************** ******** * ******** **  ******** 

B18hum    CGAGCCCACAAGGCGGAAGGCAAGGTTCGGCTCCTGGCTGATCCCACTGG 
B18mouse  CGAGCCCACCAGGCAGAAGGCAAGGTTCGGCTCCTGGCTGACCCCACTGG 
          ********* **** ************************** ******** 

B18hum    GGCCTTTGGGAAGGAGACAGACTTATTACTAGATGATTCGCTGGTGTCCA 
B18mouse  AGCCTTTGGGAAGGCGACAGACTTATTATTGGATGATTCTTTGGTGTCTC 
           ************* ************* * ********  ******* 

B18hum    TCTTTGGGAATCGACGTCTCAAGAGGTTCTCCATGGTGGTACAGGATGGC 
B18mouse  TCTTTGGGAATCGTCGGCTGAAAAGGTTCTCCATGGTGATAGACAACGGC 
          ************* ** ** ** *************** ** *  * *** 

B18hum    ATAGTGAAGGCCCTGAATGTGGAACCAGATGGCACAGGCCTCACCTGCAG 
B18mouse  ATAGTGAAGGCACTGAACGTGGAGCCAGATGGCACAGGCCTCACCTGCAG 
          *********** ***** ***** ************************** 



B18hum    CCTGGCACCCAATATCATCTCACAGCTCTGAGGCCCTGGGCCAGATTACT 
B18mouse  CCTGGCCCCCAACATCCTCTCCCAACTCTGAGGCCCTG-CCAGATG 
          ****** ***** *** **** ** ************ 

B18hum    TCCTCCACCCCTCCCTATCTCACCTGCCCAGCCCTGTGCTGGGGCCCTGC 
B18mouse  TC-TCTGACTCTCCC-ATCTCTCCCACCCGGCTCT AGGCC 
          ****** * ***** ***** ** *** ** ** **** 

B18hum    AATTGGAATGTTGGCCAGATTTCTGCAATAAACACTTGTGGTTTGCGGAA 
B18mouse  AAAAGGCT CGGT ACCT CCTT ACTGGGAGC - CACGT 
          * * * * * ** * * ** 
(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: UNIVERSITE CATHOLIQUE DE LOUVAIN 

Halles Universitaires 

(B) STREET: Place de l'Universite, 1 

(C) CITY: LOUVAIN-LA-NEUVE 

(E) COUNTRY: BELGIUM 

(F) POSTAL CODE (ZIP) : B-1348 

(A) NAME: UNIVERSITE DE MONS-HAINAUT 

(B) STREET: Place du Parc 20 

(C) CITY: MONS 

(E) COUNTRY: BELGIUM 

(F) POSTAL CODE (ZIP): B-7000 

(ii) TITLE OF INVENTION: PEROXISOME-ASSOCIATED PEPTIDE, NUCLEOTIDE 
SEQUENCE ENCODING SAID PEPTIDE AND THEIR USES IN THE 
DIAGNOSTIC AND/OR THE TREATMENT OF LUNG INJURIES AND 
DISEASES, AND OF OXIDATIVE STRESS-RELATED DISORDERS 

(iii) NUMBER OF SEQUENCES: 19 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO) 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 805 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 193..681 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
GCCAGGAGGC GGAGTGGAAG TGGCCGTGGG GCGGGTATGG GACTAGCTGG CGTGTGCGCC 60 

CTGAGACGCT CAGCGGGCTA TATACTCGTC GGTGGGGCCG GCGGTCAGTC TGCGGCAGCG 120 

GCAGCAAGAC GGTGCAGTGA AGGAGAGTGG GCGTCTGGCG GGGTCCGCAG TTTCAGCAGA 180 

GCCGCTGCAG CC ATG GCC CCA ATC AAG GTG GGA GAT GCC ATC CCA GCA 228 
Met Ala Pro Ile Lys Val Gly Asp Ala Ile Pro Ala 
1 5 10 

GTG GAG GTG TTT GAA GGG GAG CCA GGG AAC AAG GTG AAC CTG GCA GAG 276 
Val Glu Val Phe Glu Gly Glu Pro Gly Asn Lys Val Asn Leu Ala Glu 
15 20 25 

CTG TTC AAG GGC AAG AAG GGT GTG CTG TTT GGA GTT CCT GGG GCC TTC 324 
Leu Phe Lys Gly Lys Lys Gly Val Leu Phe Gly Val Pro Gly Ala Phe 
30 35 40 

ACC CCT GGA TGT TCC AAG ACA CAC CTG CCA GGG TTT GTG GAG CAG GCT 372 
Thr Pro Gly Cys Ser Lys Thr His Leu Pro Gly Phe Val Glu Gln Ala 
45 50 55 60 

GAG GCT CTG AAG GCC AAG GGA GTC CAG GTG GTG GCC TGT CTG AGT GTT 420 
Glu Ala Leu Lys Ala Lys Gly Val Gln Val Val Ala Cys Leu Ser Val 
65 70 75 

AAT GAT GCC TTT GTG ACT GGC GAG TGG GGC CGA GCC CAC AAG GCG GAA 468 
Asn Asp Ala Phe Val Thr Gly Glu Trp Gly Arg Ala His Lys Ala Glu 
80 85 90 

GGC AAG GTT CGG CTC CTG GCT GAT CCC ACT GGG GCC TTT GGG AAG GAG 516 
Gly Lys Val Arg Leu Leu Ala Asp Pro Thr Gly Ala Phe Gly Lys Glu 
95 100 105 

ACA GAC TTA TTA CTA GAT GAT TCG CTG GTG TCC ATC TTT GGG AAT CGA 564 
Thr Asp Leu Leu Leu Asp Asp Ser Leu Val Ser Ile Phe Gly Asn Arg 
110 115 120 

CGT CTC AAG AGG TTC TCC ATG GTG GTA CAG GAT GGC ATA GTG AAG GCC 612 
Arg Leu Lys Arg Phe Ser Met Val Val Gln Asp Gly Ile Val Lys Ala 
125 130 135 140 

CTG AAT GTG GAA CCA GAT GGC ACA GGC CTC ACC TGC AGC CTG GCA CCC 660 
Leu Asn Val Glu Pro Asp Gly Thr Gly Leu Thr Cys Ser Leu Ala Pro 
145 150 155 

AAT ATC ATC TCA CAG CTC TGA GGCCCTGGGC CAGATTACTT CCTCCACCCC 711 
Asn Ile Ile Ser Gln Leu * 
160 

TCCCTATCTC ACCTGCCCAG CCCTGTGCTG GGGCCCTGCA ATTGGAATGT TGGCCAGATT 771 
TCTGCAATAA ACACTTGTGG TTTGCGGAAA AAAA 805 
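
The CDS feature (LOCATION: 193..681) can be related to the translation printed beneath the codons with a few lines of code. The Python sketch below is illustrative only and is not part of the sequence listing. It builds the standard genetic code table, takes the first 228 nucleotides of SEQ ID NO: 1 as printed above, and translates the 1-based inclusive interval starting at position 193. The names `translate_cds` and `CDNA_PREFIX` are hypothetical, and the standard (nuclear) genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_cds(sequence: str, start: int, end: int) -> str:
    """Translate the 1-based, inclusive interval start..end of a DNA sequence."""
    cds = sequence[start - 1:end].upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 1..228 of SEQ ID NO: 1, copied from the listing.
CDNA_PREFIX = (
    "GCCAGGAGGCGGAGTGGAAGTGGCCGTGGGGCGGGTATGGGACTAGCTGGCGTGTGCGCC"
    "CTGAGACGCTCAGCGGGCTATATACTCGTCGGTGGGGCCGGCGGTCAGTCTGCGGCAGCG"
    "GCAGCAAGACGGTGCAGTGAAGGAGAGTGGGCGTCTGGCGGGGTCCGCAGTTTCAGCAGA"
    "GCCGCTGCAGCCATGGCCCCAATCAAGGTGGGAGATGCCATCCCAGCA"
)

peptide = translate_cds(CDNA_PREFIX, 193, 228)
print(peptide)
```

With the full 805-nucleotide sequence in place of the prefix, the same call with `end=681` would cover the whole coding region, including the terminating TGA codon.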



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 163 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
Met Ala Pro Ile Lys Val Gly Asp Ala Ile Pro Ala Val Glu Val Phe 
1 5 10 15 

Glu Gly Glu Pro Gly Asn Lys Val Asn Leu Ala Glu Leu Phe Lys Gly 
20 25 30 

Lys Lys Gly Val Leu Phe Gly Val Pro Gly Ala Phe Thr Pro Gly Cys 
35 40 45 

Ser Lys Thr His Leu Pro Gly Phe Val Glu Gln Ala Glu Ala Leu Lys 
50 55 60 

Ala Lys Gly Val Gln Val Val Ala Cys Leu Ser Val Asn Asp Ala Phe 
65 70 75 80 

Val Thr Gly Glu Trp Gly Arg Ala His Lys Ala Glu Gly Lys Val Arg 
85 90 95 

Leu Leu Ala Asp Pro Thr Gly Ala Phe Gly Lys Glu Thr Asp Leu Leu 
100 105 110 

Leu Asp Asp Ser Leu Val Ser Ile Phe Gly Asn Arg Arg Leu Lys Arg 
115 120 125 

Phe Ser Met Val Val Gln Asp Gly Ile Val Lys Ala Leu Asn Val Glu 
130 135 140 

Pro Asp Gly Thr Gly Leu Thr Cys Ser Leu Ala Pro Asn Ile Ile Ser 
145 150 155 160 

Gln Leu * 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 780 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Rattus Rattus 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 136..624 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
TGCGTCCTAG GCAGCATAGC CGGATCGGTG CTCCGTGCAT CGGCTACTTG GACGTGCGTG 60 

GCAGGCAGAG CAGGCCGGAA AGGAGCAGGT TGGGAGTGTG GTGGGGCCCG CAGCTTCAGC 120 

AGTGCCGCGG TGACTATGGC CCCGATCAAG GTGGGAGACA CCATTCCCTC AGTGGAGGTA 180 

TTTGRAGGGG AACCTGGAAA GAAGGTGAAC TTGGCAGAGC TGTTCAAGGA CAAGAAAGGT 240 

GTTTTGTTTG GAGTCCCTGG GGCATTTACA CCTGGCTGTT CCAAGACCCA TCTGCCTGGG 300 

TTTGTGGAGC AAGCCGGAGC TCYGAAGGCC AAGGGAGCAC AAGTGGTGGC CTGTCTGAGT 360 

GTTAATGATG YCTTCGTGAC TGCAGAGTGG GGTCGAGCCC ACCAGGCAGA AGGCAAGGTT 420 

CAGCTCCTGG CTGACCCCAC TGGAGCTTTT GGAAAGGAGA CAGATTTACT ACTAGATGAT 480 

TCTTTGGTGT CTCTCTTTGG GAATCGTCGG CTAAAAAGGT TCTCCATGGT GATAGACAAG 540 

GGCGTAGTAA AGGCACTGAA CGTGGAGCCG GATGGCACAG GCCTCACCTG CAGCCTGGCC 600 

CCCAACATCC TCTCACAACT CTGAGGCCCT GACCAGAATG TCCTCTGACT CTCCCATCTC 660 

CTCCACCCAG CTCTGGGCCA AAGGCCCAGT ACCTCCTTAC CTGAGGGCCA CTGGAATGGA 720 

ACCTTGACAA TATTTCTGCA ATAAACAGTT TAATTTGTGA AAAAAAAAAA AAAAAAAAAA 780 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH : 162 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Rattus Rattus 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 17 

(D) OTHER INFORMATION: /product= "Glu or Gly" 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 63 

(D) OTHER INFORMATION: /product= "Leu or Pro" 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 79 

(D) OTHER INFORMATION: /product= "Ala or Val" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Ala Pro Ile Lys Val Gly Asp Thr Ile Pro Ser Val Glu Val Phe 
1 5 10 15 

Xaa Gly Glu Pro Gly Lys Lys Val Asn Leu Ala Glu Leu Phe Lys Asp 
20 25 30 

Lys Lys Gly Val Leu Phe Gly Val Pro Gly Ala Phe Thr Pro Gly Cys 
35 40 45 

Ser Lys Thr His Leu Pro Gly Phe Val Glu Gln Ala Gly Ala Xaa Lys 
50 55 60 

Ala Lys Gly Ala Gln Val Val Ala Cys Leu Ser Val Asn Asp Xaa Phe 
65 70 75 80 

Val Thr Ala Glu Trp Gly Arg Ala His Gln Ala Glu Gly Lys Val Gln 
85 90 95 

Leu Leu Ala Asp Pro Thr Gly Ala Phe Gly Lys Glu Thr Asp Leu Leu 
100 105 110 

Leu Asp Asp Ser Leu Val Ser Leu Phe Gly Asn Arg Arg Leu Lys Arg 
115 120 125 

Phe Ser Met Val Ile Asp Lys Gly Val Val Lys Ala Leu Asn Val Glu 
130 135 140 

Pro Asp Gly Thr Gly Leu Thr Cys Ser Leu Ala Pro Asn Ile Leu Ser 
145 150 155 160 

Gln Leu 
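
The three Xaa residues of SEQ ID NO: 4 (positions 17, 63 and 79) line up with the ambiguity symbols R and Y in the rat nucleotide sequence of SEQ ID NO: 3, whose codons 17, 63 and 79 read GRA, CYG and GYC. The Python sketch below is illustrative only and is not part of the sequence listing. It expands each ambiguous codon into the concrete codons it may stand for and lists the encoded residues, assuming the standard genetic code and the usual IUPAC meaning of R (A or G) and Y (C or T). The name `possible_residues` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Only the IUPAC symbols that occur in SEQ ID NO: 3 are listed here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}


def possible_residues(codon: str) -> set[str]:
    """Return every amino acid an ambiguous codon can encode."""
    choices = [IUPAC[base] for base in codon.upper()]
    return {CODON_TABLE["".join(concrete)] for concrete in product(*choices)}


for position, codon in [(17, "GRA"), (63, "CYG"), (79, "GYC")]:
    print(position, codon, sorted(possible_residues(codon)))
```

Under these assumptions the three codons expand to Glu or Gly, Leu or Pro, and Ala or Val respectively, matching the Modified-site annotations of SEQ ID NO: 4.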



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 675 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mouse 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 99..588 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

TGCTCCGTGC ATCGACGTGC TTGGCAGGCA GAGCAGGCCG GAAAGAAGCA GGTTGGGAGT 60 

GTGGCGGAGC CCGCAGCTTC AGCAGCTCCG CGGTGACCAT GGCCCCGATC AAGGTGGGAG 120 

ATGCCATTCC CTCAGTGGAG GTATTTGAAG GGGAACCGGG AAAGAAGGTG AACTTGGCAG 180 

AGCTGTTCAA GGGCAAGAAA GGTGTTTTGT TTGGAGTCCC TGGGGCATTT ACACCTGGCT 240 
GTTCTAAGAC CCACCTGCCT GGGTTTGTGG AGCAAGCTGG AGCTCTGAAG GCTAAGGGAG 300 

CGCAGGTGGT GGCCTGTCTG AGCGTTAATG ACGTCTTTGT GATTGAAGAG TGGGGTCGAG 360 

CCCACCAGGC AGAAGGCAAG GTTCGGCTCC TGGCTGACCC CACTGGAGCC TTTGGGAAGG 420 

CGACAGACTT ATTATTGGAT GATTCTTTGG TGTCTCTCTT TGGGAATCGT CGGCTGAAAA 480 

GGTTCTCCAT GGTGATAGAG AACGGCATAG TGAAGGCACT GAACGTGGAG CCAGATGGCA 540 

CAGGCCTCAC CTGCAGCCTG GCCCCCAACA TCCTCTCCCA ACTCTGAGGC CCTGGCCAGA 600 

TGTCCTCTGA CTCTCCCATC TCTCCCACCC GGCTCTAGGC CAAAAGGCTC GGTACCTCCT 660 

TACTGGGAGC CACGT 675 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 162 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mouse 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Ala Pro Ile Lys Val Gly Asp Ala Ile Pro Ser Val Glu Val Phe 
1 5 10 15 

Glu Gly Glu Pro Gly Lys Lys Val Asn Leu Ala Glu Leu Phe Lys Gly 
20 25 30 

Lys Lys Gly Val Leu Phe Gly Val Pro Gly Ala Phe Thr Pro Gly Cys 
35 40 45 

Ser Lys Thr His Leu Pro Gly Phe Val Glu Gln Ala Gly Ala Leu Lys 
50 55 60 

Ala Lys Gly Ala Gln Val Val Ala Cys Leu Ser Val Asn Asp Val Phe 
65 70 75 80 

Val Ile Glu Glu Trp Gly Arg Ala His Gln Ala Glu Gly Lys Val Arg 
85 90 95 

Leu Leu Ala Asp Pro Thr Gly Ala Phe Gly Lys Ala Thr Asp Leu Leu 
100 105 110 

Leu Asp Asp Ser Leu Val Ser Leu Phe Gly Asn Arg Arg Leu Lys Arg 
115 120 125 

Phe Ser Met Val Ile Asp Asn Gly Ile Val Lys Ala Leu Asn Val Glu 
130 135 140 

Pro Asp Gly Thr Gly Leu Thr Cys Ser Leu Ala Pro Asn Ile Leu Ser 
145 150 155 160 

Gln Leu 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 469 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE : 

(A) NAME /KEY: CDS 

(B) LOCATION: 161..382 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GGGTATGGGA CTAGCTGGCG TGTGCGCCCT GAGACGCTCA GCGGGCTATA TACTCGTCGG 60 

TGGGGCCGGC GGTCAGTCTG CGGCAGCGGC AGCAAGACGG TGCAGTGAAG GAGAGTGGGC 120 

GTCTGGCGGG GTCCGCAGTT TCAGCAGAGC CGCTGCAGCC ATGGCCCCAA TCAAGGTTCG 180 

GCTCCTGGCT GATCCCACTG GGGCCTTTGG GAAGGAGACA GACTTATTAC TAGATGATTC 240 

GCTGGTGTCC ATCTTTGGGA ATCGACGTCT CAAGAGGTTC TCCATGGTGG TACAGGATGG 300 

CATAGTGAAG GCCCTGAATG TGGAACCAGA TGGCACAGGC CTCACCTGCA GCCTGGCACC 360 

CAATATCATC TCACAGCTCT GAGGCCCTGG GCCAGATTAC TTCCTCCACC CCTCCCTATC 420 

TCACCTGCCC AGCCGTGTGC TGGGGCCCTG CAATTGGAAT GTTGGCCAG 469 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 601 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/ KEY: CDS 

(B) LOCATION: 161..514 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GGGTATGGGA CTAGCTGGCG TGTGCGCCCT GAGACGCTCA GCGGGCTATA TACTCGTCGG 60 

TGGGGCCGGC GGTCAGTCTG CGGCAGCGGC AGCAAGACGG TGCAGTGAAG GAGAGTGGGC 120 

GTCTGGCGGG GTCCGCAGTT TCAGCAGAGC CGCTGCAGCC ATGGCCCCAA TCAAGACACA 180 

CCTGCCAGGG TTTGTGGAGC AGGCTGAGGC TCTGAAGGCC AAGGGAGTCC AGGTGGTGGC 240 


CTGTCTGAGT GTTAATGATG CCTTTGTGAC TGGCGAGTGG GGCCGAGCCC ACAAGGCGGA 300 

AGGCAAGGTT CGGCTCCTGG CTGATCCCAC TGGGGCCTTT GGGAAGGAGA CAGACTTATT 360 

ACTAGATGAT TCGCTGGTGT CCATCTTTGG GAATCGACGT CTCAAGAGGT TCTCCATGGT 420 

GGTACAGGAT GGCATAGTGA AGGCCCTGAA TGTGGAACCA GATGGCACAG GCCTCACCTG 480 

CAGCCTGGCA CCCAATATCA TCTCACAGCT CTGAGGCCCT GGGCCAGATT ACTTCCTCCA 540 

CCCCTCCCTA TCTCACCTGC CCAGCCCTGT GCTGGGGCCC TGCAATTGGA ATGTTGGCCA 600 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 604 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/ KEY: CDS 

(B) LOCATION: 161..517 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
GGGTATGGGA CTAGCTGGCG TGTGCGCCCT GAGACGCTCA GCGGGCTATA TACTCGTCGG 60 

TGGGGCCGGC GGTCAGTCTG CGGCAGCGGC AGCAAGACGG TGCAGTGAAG GAGAGTGGGC 120 

GTCTGGCGGG GTCCGCAGTT TCAGCAGAGC CGCTGCAGCC ATGGCCCCAA TCAAGGTGGG 180 

AGATGCCATC CCAGCAGTGG AGGTGTTTGA AGGGGAGCCA GGGAACAAGG TGAACCTGGC 240 

AGAGCTGTTC AAGGGCAAGA AGGGTGTGCT GTTTGGAGTT CCTGGGGCCT TCACCCCTGG 300 

ATGTTCCAAG GTTCGGCTCC TGGCTGATCC CACTGGGGCC TTTGGGAAGG AGACAGACTT 360 

ATTACTAGAT GATTCGCTGG TGTCCATCTT TGGGAATCGA CGTCTCAAGA GGTTCTCCAT 420 

GGTGGTACAG GATGGCATAG TGAAGGCCCT GAATGTGGAA CCAGATGGCA CAGGCCTCAC 480 

CTGCAGCCTG GCACCCAATA TCATCTCACA GCTCTGAGGC CCTGGGCCAG ATTACTTCCT 540 

CCACCCCTCC CTATCTCACC TGCCCAGCCC TGTGCTGGGG CCCTGCAATT GGAATGTTGG 600 

CCAG 604 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2710 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: exon 

(B) LOCATION: 2516..2710 

(ix) FEATURE: 

(A) NAME/KEY: exon 

(B) LOCATION: 2074..2135 

(ix) FEATURE: 

(A) NAME/KEY: exon 

(B) LOCATION: 1932..1970 

(ix) FEATURE: 

(A) NAME/KEY: exon 

(B) LOCATION: 1728..1859 

(ix) FEATURE: 

(A) NAME/KEY: exon 

(B) LOCATION: 802..936 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
TCTGTCCCTT AGCGCCCCCG CGGGGGCTTA CCCCATCCCA CTCCATGACC TCCCCTCCCC 60 

CCATGGCGAA TTCCCACCTT TCTGTCTTTC ACTCACTTCC TGGAACCGTC CCCAGGGCCT 120 

TGGACCTTCC CCCTTCTCCT CCCAAACCTT GTGAGACCCC ATTCCCTTTC TACTTCATCC 180 

TGCTCTCAAC TTTTGGGCTC CTCAGAGGCC CTCACCCCTG ACTCTCTCTC CCTACCCACT 240 

CTGGTCCCAT GAAGCCCTCA AGTACTCTGG GGATGGATCC TTCCCCCTTC AAAAGATTCC 300 

TTCTTTTGTT CTACACCTCC TGGGTGTAGG GGCCTGGACA CCCTCCCCCA ACGTTCCACC 360 

TGCCGCTGCC CTTCCTCTTC CTCCTCCTGA GGGTGGGACC CTCAGACCTG GCCAAGATCC 420 

TCTCCCTCCA TGTTGTCAGG GACTCCTCCT CACCCCCAAA TACAGCCCTC TAGCCCCTGT 480 

CCATTTTATT CCACTCCTTT CCTGTAACCT AGACAGCATG TTATGCAACC CTTTGCGACA 540 

CATGGGGAAA CCTTCCCTCC CTTCCTCTGT TGTCACCAAT GGCCCCTTAA GAGGAGCAGG 600 

GCCACCTTGA AACTTGGAGG ATATGGGGTA ACCCAGTGGfe AGCGGGCAGG GAGGGCCCTT 660 

GGAAACTGAC AGGGCTGGAG TATCCTGCTG GGTTTCAGCC CCGGTTCCTG CAGGCACAGC 720 

TGCCAGGCTC TCTGTTCACC TTCCTGCCTC TGGTTTGCCC CGGCTCCCTC ACCCCCCTTA 780 

CCCTGGAGTC CTTCCTTCTA GGTGGGAGAT GCCATCCCAG CAGTGGAGGT GTTTGAAGGG 840 

GAGCCAGGGA ACAAGGTGAA CCTGGCAGAG CTGTTCAAGG GCAAGAAGGG TGTGCTGTTT 900 

GGAGTTCCTG GGGCCTTCAC CCCTGGATGT TCCAAGGTGA GGCCCTTCCC CTTCTGAAGA 960 

TCAGGACCTG GGGATCTTTT GTGTTGCTCT TAAGTCCTCC ACATAGTCCT GATAGGACTC 1020 

CTAAAAAGCA TTTCAGTGCC ATCACAAAAC AAGTAGAGCT GGGTAGAGCT GGGCGCGGTG 1080 

GCTCACGCCT GTAATCCCAG CACTTTGGGA GGCCAAGGCG GGTGGATCAC GAGGTCAGGA 1140 

GTCCAAAACC AGCCTGGCCA AGATGGTGAA ACCCTGTCTC TACTAAAAAT GCAAAAAAAT 1200 

CAGCCGGATA TGGTGGCGGG CGCCTGTAAT CCCAGGTATT GGGGAGGCTG AGGCAGAGAA 1260 

TTGCTTGAAC CCAGGAGGCG TAGGTTGCAG TGAGTGGAGA TCGTGCCTCT GCAGTCCAGC 1320 

CTGGGTGAAA GAGCGAGACT CCGTCTCAAA ATGAAAAAAA AAAAAGAAAA CAAGTAGAGA 1380 

CTGCAAAAAG GGAACAGTAC CGGGAATGTT GGAGAAAAAC ATACTACAAT TAAATCCAAC 1440 

ACCCCTGTTG GTCCTGCTAA ATGACAGGCA CTGTGGAAGG TGCTTGGGAC TCAGATAAAT 1500 

AAGACAAAGA TCTGCCCATG GAAAGTTCAC GTCTGGACCA TAAGGCATTA GGTTTCATTC 1560 

TGAGCTTCCT AGTGGCCAAG GCAAAAAGGA AATAGAATGG TTTAGACAGC TCTCATTGTC 1620 

TGATCAAAGG TGTTGAGGCA GAGCACTGAG GAGGGCCTGG AGATAAAGGG TGGGCTGGGG 1680 

GTCAGATGCA GTTATCCCTT TGCCGACCCT TTGTTCCCCT TCCTCAGACA CACCTGCCAG 1740 

GGTTTGTGGA GCAGGCTGAG GCTCTGAAGG CCAAGGGAGT CCAGGTGGTG GCCTGTCTGA 1800 

GTGTTAATGA TGCCTTTGTG ACTGGCGAGT GGGGCCGAGC CCACAAGGCG GAAGGCAAGG 1860 
TGAGGTGAGG GGCCTGCAGG GAGTCAGGAC CAGGTAGGAT ATTCTTCTTG TGACCTCTAC 1920 

TTTCTCTGCA GGTTCGGCTC CTGGCTGATC CCACTGGGGC CTTTGGGAAG GTGAGTGTTC 1980 

CCCTGACCGC CACAGGGACA TGGCGGTGCG GGGAGCAGTG GGGGCCCTTG GCCTCTTCAA 2040 

GGATTTCTGA CACTTTTCTC TGTCTCTTCT TAGGAGACAG ACTTATTACT AGATGATTCG 2100 

CTGGTGTCCA TCTTTGGGAA TCGACGTCTC AAGAGGTAAA AGTGGAGAGT CCTCTGTGGA 2160 

GAAAGTCCTC TGTGGGAGAG AGTCCTCTGT GGGAGAGAGT CCTCTGTGGA GAGGGTCCTC 2220 

TGTGGGAAGA GTCGTCTGTG GGGGAGATGT GTGGGAGAGA GTCCTGTGTG GGGAGAGTCT 2280 

TCTGTAGGGG AGAGTCCTCT GGGGAGAGAG TCCTGTGTGG GGGAGAGTCC TCTGTGGGGA 2340 

GAGTCCTCTG TGTGGAGAGA GTCCTGTGTG GTGGTGAGTC CTCTGTGGGG GAGAGTCCTC 2400 

TGTGGGGGGA GTCCTCTCTG GAGTTCTCTT GGGCCCCTGG CTGTTCACTG CCTGTCTCCA 2460 

TGCCCAGCCT CCAAGCCCAG GCTGATGCAG CTGGCTGGGC CCCTCTTTCC GGCAGGTTCT 2520 

CCATGGTGGT ACAGGATGGC ATAGTGAAGG CCCTGAATGT GGAACCAGAT GGCACAGGCC 2580 

TCACCTGCAG CCTGGCACCC AATATCATCT CACAGCTCTG AGGCCCTGGG CCAGATTACT 2640 

TCCTCCACCC CTCCCTATCT CACCTGCCCA GCCCTGTGCT GGGGCCCTGC AATTGGAATG 2700 

TTGGCCAGAT 2710 
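
The exon features declared for SEQ ID NO: 10 give the layout of the five annotated exons directly. The Python sketch below is illustrative only and is not part of the sequence listing. It parses the LOCATION strings of the five exon features, orders them along the genomic sequence, and derives exon lengths and the sizes of the intervening regions. The names `parse_location` and `EXON_LOCATIONS` are hypothetical, and locations are assumed to be 1-based and inclusive, as is usual in this listing format.

```python
def parse_location(location: str) -> tuple[int, int]:
    """Parse a 'start..end' feature location (1-based, inclusive)."""
    start, end = location.replace(" ", "").split("..")
    return int(start), int(end)


# Exon features of SEQ ID NO: 10, in the order given in the listing.
EXON_LOCATIONS = ["2516..2710", "2074..2135", "1932..1970", "1728..1859", "802..936"]

exons = sorted(parse_location(loc) for loc in EXON_LOCATIONS)
exon_lengths = [end - start + 1 for start, end in exons]
# Gap between the end of one exon and the start of the next one.
intron_lengths = [
    next_start - prev_end - 1
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
]

for (start, end), length in zip(exons, exon_lengths):
    print(f"exon {start}..{end}: {length} bp")
print("intervening regions:", intron_lengths)
```

The last annotated exon runs to the final position of the 2710-base-pair sequence, so its length here reflects the end of the sequenced fragment rather than a confirmed exon boundary.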
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GCCATCCCAG CAGTGGAGGT GTTTG 25 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
TTGAACAGCT CTGCCAGGTT CACC 
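
SEQ ID NO: 12 does not occur as such in the human cDNA of SEQ ID NO: 1, which suggests it is written as a reverse-strand oligonucleotide. The Python sketch below is illustrative only and is not part of the sequence listing. It computes the reverse complement of SEQ ID NO: 12 and looks for it in a fragment of the coding strand (nucleotides 229 to 297 of SEQ ID NO: 1, copied from the listing). The names `reverse_complement` and `CDNA_FRAGMENT` are hypothetical, and the listing itself does not state the orientation or intended use of the oligonucleotide.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


SEQ_ID_NO_12 = "TTGAACAGCTCTGCCAGGTTCACC"

# Nucleotides 229..297 of SEQ ID NO: 1 (coding strand).
CDNA_FRAGMENT = (
    "GTGGAGGTGTTTGAAGGGGAGCCAGGGAACAAGGTGAACCTGGCAGAG"
    "CTGTTCAAGGGCAAGAAGGGT"
)

target = reverse_complement(SEQ_ID_NO_12)
print(target, target in CDNA_FRAGMENT)
```

The forward-strand search for SEQ ID NO: 12 itself fails on this fragment, while the reverse complement is found, which is consistent with an antisense orientation.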
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
TGGAGGTGTT TGAAGGGGAG CCAG 
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
CAGGTTCACC TTGTTCCCTG GCTC 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
GGGTATGGGA CTAGCTGGCG 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
CTGGCCAACA TTCCAATTGC AG 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
ATGTTATGCA ACCCTTTGCG ACAC 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
GTGTTTGAAG GGGAGCCAGG GAAC 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
AGAGACAGGG TTTCACCATC TTGG 24 



